WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C07K 14/705, C12N 15/12, A61K 38/17, 
C12Q 1/68 


A1


(11) International Publication Number: WO 99/00422 
(43) International Publication Date: 7 January 1999 (07.01.99) 


(21) International Application Number: PCT/US98/13680

(22) International Filing Date: 30 June 1998 (30.06.98) 

(30) Priority Data: 

60/051,284 30 June 1997 (30.06.97) US

(71) Applicant: PRESIDENT AND FELLOWS OF HARVARD 

COLLEGE [US/US]; 124 Mt. Auburn Street, Cambridge, 
MA 02138 (US). 

(72) Inventors: BUCK, Linda; Apartment 2, 27 Dartmouth Street, 

Boston, MA 02116 (US). DULAC, Catherine; 10 DeWolfe 
Street #48, Cambridge, MA 02138 (US). HERRADA, 
Gilles; 153 Salem Street, Boston, MA 02113 (US). MATSUNAMI,
Hiroaki; 15 University Road #31, Brookline, MA
02146 (US).

(74) Agent: PLUMER, Elizabeth, R.; Wolf, Greenfield & Sacks, 
P.C., 600 Atlantic Avenue, Boston, MA 02210 (US). 


(81) Designated States: CA, JP, European patent (AT, BE, CH, CY,
DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE). 

Published 

With international search report. 
With amended claims. 



(54) Title: NOVEL FAMILY OF PHEROMONE RECEPTORS 



(57) Abstract 



The invention describes a multigene family encoding a collection of novel mammalian pheromone receptors. Nucleic acids encoding 
the pheromone receptor polypeptides, including fragments and biologically functional variants thereof, are provided. Also included are
polypeptides and fragments thereof encoded by such nucleic acids, and antibodies relating thereto. Methods and products for using such 
nucleic acids and polypeptides also are provided. 



FOR THE PURPOSES OF INFORMATION ONLY 
Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL  Albania                   ES  Spain                     LS  Lesotho                   SI  Slovenia
AM  Armenia                   FI  Finland                   LT  Lithuania                 SK  Slovakia
AT  Austria                   FR  France                    LU  Luxembourg                SN  Senegal
AU  Australia                 GA  Gabon                     LV  Latvia                    SZ  Swaziland
AZ  Azerbaijan                GB  United Kingdom            MC  Monaco                    TD  Chad
BA  Bosnia and Herzegovina    GE  Georgia                   MD  Republic of Moldova       TG  Togo
BB  Barbados                  GH  Ghana                     MG  Madagascar                TJ  Tajikistan
BE  Belgium                   GN  Guinea                    MK  The former Yugoslav       TM  Turkmenistan
BF  Burkina Faso              GR  Greece                        Republic of Macedonia     TR  Turkey
BG  Bulgaria                  HU  Hungary                   ML  Mali                      TT  Trinidad and Tobago
BJ  Benin                     IE  Ireland                   MN  Mongolia                  UA  Ukraine
BR  Brazil                    IL  Israel                    MR  Mauritania                UG  Uganda
BY  Belarus                   IS  Iceland                   MW  Malawi                    US  United States of America
CA  Canada                    IT  Italy                     MX  Mexico                    UZ  Uzbekistan
CF  Central African Republic  JP  Japan                     NE  Niger                     VN  Viet Nam
CG  Congo                     KE  Kenya                     NL  Netherlands               YU  Yugoslavia
CH  Switzerland               KG  Kyrgyzstan                NO  Norway                    ZW  Zimbabwe
CI  Côte d'Ivoire             KP  Democratic People's       NZ  New Zealand
CM  Cameroon                      Republic of Korea         PL  Poland
CN  China                     KR  Republic of Korea         PT  Portugal
CU  Cuba                      KZ  Kazakstan                 RO  Romania
CZ  Czech Republic            LC  Saint Lucia               RU  Russian Federation
DE  Germany                   LI  Liechtenstein             SD  Sudan
DK  Denmark                   LK  Sri Lanka                 SE  Sweden
EE  Estonia                   LR  Liberia                   SG  Singapore







WO 99/00422 PCT/US98/13680 
NOVEL FAMILY OF PHEROMONE RECEPTORS 

Field of the Invention

This invention relates to nucleic acids and encoded polypeptides which are part of a 
multigene family encoding a collection of novel mammalian pheromone receptors. The 
invention further provides representative nucleic acids and encoded polypeptides in this 
multigene family. The representative polypeptides are expressed in the murine and rat 
vomeronasal organ (VNO). Agents which bind the nucleic acids or polypeptides also are
provided. The invention further relates to methods of using such nucleic acids and polypeptides 
in the diagnosis and/or treatment of disease, including the use of these molecules in controlling 
fertility and behavior in vertebrates and invertebrates. 

Background of the Invention

Pheromones are intraspecific chemical signals found throughout the animal kingdom.
They regulate populations of animals by inducing innate behaviors and stereotyped changes in
physiology (Karlson and Luscher, Nature, 1959, 183:55-56; Wilson, Sci. Am., 1963, 208:100-114;
Sorensen, Chem. Sens., 1996, 21:245-256). Pheromones can serve as cues for
overcrowding, impending danger, reproductive status, gender, or dominance. In rodents, a
variety of pheromone effects have been reported. These include effects on estrus and the onset
of puberty as well as the induction of mating and aggressive behaviors (Singer, A.G., J. Steroid
Biochem. Molec. Biol., 1991, 39:627-632; Halpern, M., Ann. Rev. Neurosci., 1987, 10:325-362;
Wysocki, C.J., et al., In the Neurobiology of Taste and Smell, 1987, 125-150; Novotny et al.,
Chemical Signals in Vertebrates, 1990, Vol. 5, eds. D.W. Macdonald et al., Oxford University
Press).

The detection of pheromones is mediated by the olfactory system. However, sensory
neurons that detect pheromones are typically segregated from those that detect volatile odorants
(Keverne, E.B., Trends Neurosci., 1983, 6:381-384; Halpern, M., Ann. Rev. Neurosci., 1987,
10:325-362; Wysocki, C.J., et al., In the Neurobiology of Taste and Smell, 1987, 125-150;
Hildebrand, J.G., et al., Brain Res., 1997, 677:157-161). In mammals, sensory neurons in the
nasal olfactory epithelium (OE) detect volatile odorants and some pheromones while those in an
accessory olfactory organ, called the vomeronasal organ (VNO), are thought to be specialized
to detect pheromones. The VNO is a tubular structure, at the base of the nasal septum, which is
connected to the nasal cavity by a small duct. Signals from the OE are relayed through the
olfactory bulb (OB) to the olfactory cortex, and then to multiple brain regions, including those
involved in conscious perception. In contrast, signals from the VNO are conveyed through the
accessory olfactory bulb (AOB) to the amygdala and hypothalamus, areas associated with the
endocrine and behavioral responses induced by pheromones.

Volatile odorants are detected in the OE by as many as 1000 different types of odorant
receptors (ORs), which are differentially expressed by olfactory sensory neurons (Buck and Axel,
Cell, 1991, 65:175-187; Levy, N.S., et al., J. Steroid Biochem. Mol. Biol., 1991, 39:633-637;
Nef, P., et al., Proc. Natl. Acad. Sci., 1992, 89:8948-8952; Strotman, J., et al.,
Neuroreport, 1992, 3:1053-1056; Ngai, J., et al., Cell, 1993, 72:667-680; Ressler, K.J., et al.,
Cell, 1993, 73:597-609; Vassar, R., et al., Cell, 1993, 74:309-318). The ORs are thought to
couple to the G protein α subunit, Gαolf, thereby initiating a cascade of transduction events which
culminate in the generation of action potentials in the sensory axons (reviewed in Firestein, S.,
Curr. Opin. in Neurobiology, 1992, 2:444-448; Reed, R., Neuron, 1992, 8:205-209; Ronnett, G.,
et al., Trends Neurosci., 1992, 15:508-513). Current evidence suggests that each OR may
recognize a particular molecular feature that can be shared by many odorants (Ressler, K., et al.,
Cell, 1994, 79:1245-1255; Vassar, R., et al., Cell, 1994, 79:981-991; Axel, R., Sci. Am., 1995,
273:154-159; Buck, L., Annu. Rev. Neurosci., 1996, 19:517-544). This is consistent with a
combinatorial coding model in which the identities of different odorants are encoded by different
combinations of receptors, but each receptor serves as one component of the codes for many
odorants. By contrast, very little is known about how pheromones are detected or encoded in
the VNO. Although VNO neurons (VNs) resemble olfactory sensory neurons in the nose, only
a rare VN expresses an OR gene. VNs also lack a number of other olfactory sensory
transduction molecules, including the G protein α subunit, Gαolf (Reed, R., Neuron, 1992, 8:205-209),
which is highly expressed in olfactory neurons (Dulac and Axel, Cell, 1995, 83:195-206;
Berghard, A., et al., Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369; Wu, Y., et al., Biochem.
Biophys. Res. Com., 1996, 220:900-904). Instead, VNs express high levels of two other G
protein α subunits, Gαo and Gαi2 (Dulac and Axel, Cell, 1995, 83:195-206; Halpern, M., Brain
Res., 1995, 677:157-161; Berghard, A., et al., Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369).
Gαo and Gαi2 are expressed in spatially-segregated subsets of VNs that form longitudinal zones
in the VNO neuroepithelium. Interestingly, Dulac and Axel have identified a family of ~100
candidate pheromone receptors ("VNRs") which appear to be expressed exclusively in the Gαi2
subset (Dulac and Axel, Cell, 1995, 83:195-206).

This invention differs from the state of the art in providing a novel family of mammalian 
pheromone receptors. Accordingly, the objects of the invention relate to providing compositions
containing these novel receptors and their binding partners and methods for using such 
compositions to modulate pheromone receptor activity. 

Summary of the Invention 

The invention involves the discovery of a multigene family of mammalian pheromone
receptors. In particular, the invention involves the cDNA cloning of multiple pheromone
receptors from a murine VNO cDNA library and from a rat VNO cDNA library. Partial 
sequences of human homologs of these pheromone receptors also are provided. 

In general, the invention provides isolated nucleic acid molecules encoding the novel
pheromone receptors, unique fragments of the isolated nucleic acid molecules, expression vectors
containing the foregoing, and host cells transfected with the foregoing. The invention also
provides isolated pheromone receptor polypeptides and agents which bind such polypeptides,
including antibodies. The foregoing can be used in the diagnosis or treatment of conditions,
including the control of fertility, that are characterized by the expression of a pheromone receptor
polypeptide. Methods for identifying pharmacological agents useful in the diagnosis or
treatment of such conditions and methods for identifying additional members of this multigene 
family also are provided. 

Applicants have discovered that the pheromone receptors disclosed herein are expressed
in the vomeronasal organ (VNO), particularly in Gαo protein-expressing neurons. This is in
contrast to the prior art VNO pheromone receptors which are expressed in neurons which express
different G-coupled proteins (Gαi2-expressing neurons). Thus, the novel pheromone receptors
disclosed herein are distinct from, and expressly exclude, the prior art VNO pheromone receptors
which differ in primary structure, as well as in cell localization. Although Applicants do not
intend the invention to be limited to a particular theory or mechanism, the amino acid sequence
homology and structural organization of the pheromone receptor polypeptides to other
well-known G-protein coupled receptors suggests that the pheromone receptors disclosed herein also
are G-protein coupled. Thus, it is anticipated that the binding to the pheromone receptor of its
cognate ligand (pheromone) will be accompanied by G-protein signal transduction, an event
which can be measured using conventional screening assays, such as assays that measure changes
in the intracellular concentrations of calcium and/or cyclic nucleotides (see, e.g., PCT
publication no. WO 94/18959, entitled "Calcium Receptor-Active Molecules", inventors E.
Nemeth et al.).

According to one aspect of the invention, a family of pheromone receptor polypeptides
is provided. Each polypeptide of the family shares amino acid sequence homology and structural
organization with a pheromone receptor polypeptide selected from the group consisting of SEQ
ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50
and 52. Each polypeptide member of the receptor family contains, from amino terminus to
carboxyl terminus, the following domains: (a) an amino-terminal extracellular domain containing
from 30 to 600 amino acids; (b) a transmembrane region comprising: (i) seven non-contiguous
transmembrane domains designated TM1, TM2, TM3, TM4, TM5, TM6 and TM7, (ii) three
non-contiguous extracellular domains designated EC2, EC3 and EC4, and (iii) three non-contiguous
intracellular domains designated IC1, IC2, and IC3, wherein the transmembrane domains, the
extracellular domains and the intracellular domains are attached to one another from amino
terminus to carboxyl terminus in the order TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7,
and wherein the transmembrane region has at least about 35% homology and
a length approximately equal to a transmembrane region of a polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and (c)
a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids. Each
polypeptide member of the family is expressed in a Gαo protein-expressing vomeronasal organ
neuron or is expressed in another olfactory organ neuron in an animal which does not possess
a vomeronasal organ. One skilled in the art can readily identify olfactory organs in animals
which do not possess a vomeronasal organ.
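
By way of illustration only, the domain organization recited above can be written as a simple ordering check. The following Python sketch is not part of the disclosure: the annotation format (a list of name and length pairs), the labels NTD and CTD, the function name check_receptor_layout and the example lengths are assumptions introduced here. Only the domain order and the limits of 30 to 600 and 5 to 200 amino acids are taken from the preceding paragraph.

```python
# Domain order recited in the text for the transmembrane region.
TM_REGION_ORDER = [
    "TM1", "IC1", "TM2", "EC2", "TM3", "IC2", "TM4",
    "EC3", "TM5", "IC3", "TM6", "EC4", "TM7",
]

# Length limits (in amino acids) recited for the two flanking domains.
NTD_RANGE = (30, 600)   # amino-terminal extracellular domain
CTD_RANGE = (5, 200)    # carboxyl-terminal intracellular domain


def check_receptor_layout(domains):
    """Return a list of problems for an ordered list of (name, length) pairs.

    `domains` is a hypothetical annotation format: amino terminus first,
    carboxyl terminus last, with "NTD" and "CTD" naming the flanking domains.
    """
    names = [name for name, _ in domains]
    expected = ["NTD"] + TM_REGION_ORDER + ["CTD"]
    if names != expected:
        return ["domain order differs from " + "-".join(expected)]
    lengths = dict(domains)
    problems = []
    for label, (low, high) in (("NTD", NTD_RANGE), ("CTD", CTD_RANGE)):
        if not low <= lengths[label] <= high:
            problems.append(f"{label} length {lengths[label]} outside {low}-{high}")
    return problems


# Hypothetical annotation: the lengths below are placeholders, not measured values.
example = [("NTD", 550)] + [(name, 22) for name in TM_REGION_ORDER] + [("CTD", 20)]
print(check_receptor_layout(example))
```

An empty list indicates that the annotated order and flanking-domain lengths fall within the recited limits; the check says nothing about sequence homology or expression pattern, which the text also requires.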

In general, the amino-terminal extracellular domains (NTDs) of the receptor family
members share sequence homology to a pheromone receptor polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46, 48, and 50 to a lesser extent than that observed for the transmembrane region. The
length of the extracellular domain can vary among members of the family. Accordingly, certain
embodiments of the invention have extracellular domains that contain at least 50, 100, 200, 300,
400 or 500 amino acids. Preferably, the transmembrane region has greater than 40% homology
with the corresponding region of a pheromone receptor polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50, and more
preferably, has even greater sequence homology (e.g., more than 50%, 60%, 70%, 80% or 90%
homology). The length of the carboxyl-terminal intracellular domain can vary among members
of the family. Accordingly, certain embodiments of the invention have carboxyl-terminal
intracellular domains that contain between 5 and 50 amino acids. More preferably,
carboxyl-terminal intracellular domains contain between 15 and 25 amino acids.
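
The percentage thresholds recited above (35%, 40%, and so on) may be easier to follow with a worked calculation. The Python sketch below is illustrative only. It assumes that the two sequences have already been aligned and uses percent identity over non-gap columns as a simple stand-in for "homology"; the passage above does not say how homology is computed, so that choice, like the function names, is an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold=35.0):
    """True when identity reaches the threshold (35% is the lowest figure recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For two hypothetical aligned fragments "ACDE" and "ACDF", three of four columns agree, so percent_identity returns 75.0 and the 35% threshold is met.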

According to another aspect of the invention, a method for identifying a nucleic acid
encoding a pheromone receptor is provided. The method involves contacting a mixture of
nucleic acid molecules (genomic library, cDNA library, genomic DNA, RNA, etc.) with at least
one nucleic acid probe of a nucleic acid selected from the group consisting of: (a) a nucleic acid
molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone
receptor polypeptide; (b) a unique fragment of (a); (c) a human homolog of (a) or (b); and (d) a
set of degenerate primers of any of (a), (b) or (c); and identifying the sequences within the
mixture that hybridize to the probe. Selected fragments of human homologs of a pheromone
receptor are selected from the group consisting of SEQ ID NO. 51, 53, 54 and 55. In certain
embodiments, the nucleic acid probe further includes a detectable label to facilitate identification
of the sequence in the library which hybridizes to the probe. In certain embodiments, the probe
is represented by a pair of degenerate polymerase chain reaction ("PCR") primers that amplify
a unique fragment of a nucleic acid molecule selected from the group consisting of SEQ ID NO.
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53,
54, and 55. The meaning of "unique fragment" in reference to a nucleic acid is provided below.
By "degenerate PCR primers that amplify a unique fragment" is meant degenerate primers which
result in the amplification of a unique fragment following a polymerase chain reaction.
According to this embodiment, the method for identifying a nucleic acid encoding a pheromone
receptor polypeptide further involves subjecting a mixture of nucleic acids and the degenerate
PCR primers to amplification conditions prior to identifying the sequences of the mixture that
hybridize to the probe and that form part of the amplification reaction products. In some
embodiments the pair of degenerate polymerase chain reaction primers is selected from a
conserved sequence motif of a pheromone receptor polypeptide. A "conserved sequence motif"
can be determined using the side-by-side comparison of the amino acid sequences of the different
pheromone receptor polypeptides of the invention. Exemplary conserved sequence motifs
include regions selected from the group consisting of amino acids 191-397, amino acids 565-825,
amino acids 637-825, amino acids 637-804, and amino acids 619-784, of the polypeptide of, for
example, SEQ ID NO. 2 (VR1). In preferred embodiments, the pair of degenerate polymerase
chain reaction primers is selected from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID
NOs. 62 and 63, SEQ ID NOs. 64 and 63, SEQ ID NOs. 64 and 65, and SEQ ID NOs. 66 and 67.
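
To illustrate what a degenerate primer represents, the following Python sketch expands a primer written with the standard IUPAC nucleotide ambiguity codes into the set of concrete sequences it stands for. The sketch is illustrative only: the primer strings in the note after the block are hypothetical and are not any of SEQ ID NOs. 60 to 67, and the function names are assumptions introduced here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]
```

For the hypothetical primer "ATGGAYTGG", the single Y position (C or T) gives a degeneracy of 2, and expand returns the two sequences "ATGGACTGG" and "ATGGATTGG".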

According to yet another aspect of the invention, an isolated nucleic acid molecule is
provided. The isolated nucleic acid molecule hybridizes under high or low stringency conditions
to a molecule consisting of a nucleic acid sequence selected from the group consisting of SEQ
ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49,
51, 53, 54, and 55. The invention further embraces nucleic acid molecules that differ from the
foregoing isolated nucleic acid molecules in codon sequence due to the degeneracy of the genetic
code. The invention also embraces complements of the foregoing nucleic acids.
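
The two notions relied on above, degeneracy of the genetic code and complementarity, can be illustrated with a short Python sketch. It is illustrative only: it uses the standard genetic code, the function names are assumptions introduced here, and the short DNA strings in the note after the block are hypothetical rather than taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def same_protein(dna_a, dna_b):
    """True when two coding sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For example, the hypothetical sequences "ATGCTTAAA" and "ATGCTGAAG" differ at two codon positions yet both translate to "MLK", so same_protein returns True for the pair.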

The pheromone receptors of the invention are expressed in the vomeronasal organ or, in
an animal which lacks such an organ, are expressed in another olfactory organ. More
particularly, the receptors of the invention are expressed in a Gαo protein-expressing vomeronasal
organ neuron. Although not intending to be bound to a particular mechanism, it is believed that
the receptors of the invention are G-protein coupled receptors. This is supported by Applicants'
discovery that the receptors of the invention are expressed in Gαo protein-expressing
vomeronasal organ neurons.

The pheromone receptors of the invention bind to ligands (pheromones) which induce 
certain changes in receptor conformation. Methods for identifying ligands which bind to the 
pheromone receptors of the invention are provided below, e.g., by forming an affinity matrix 
containing immobilized receptor and using the matrix to isolate a cognate ligand from a complex
mixture. The particular ligand bound by a particular receptor is dictated by the primary and
secondary structure of the receptor. In certain embodiments, the immobilized pheromone 
receptor polypeptide is a pheromone receptor polypeptide selected from the group consisting of 
SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 
48, 50 and 52. 

According to another aspect of the invention, an isolated nucleic acid molecule that is a
unique fragment of any of the foregoing isolated nucleic acid molecules is provided. In general,
the isolated nucleic acid molecule consists of a unique fragment between 12 and 4000
nucleotides in length, and complements thereof, of any cDNA (SEQ ID NOs. 1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55)
encoding a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO.
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
Depending upon its intended use (e.g., probe, primer), the unique fragment can be between 12
and 2000, 1000, 500, 250, 100, 50 or 25 nucleotides in length. Preferably, the isolated nucleic
acid molecule consists of between 12 and 35 contiguous nucleotides of the foregoing cDNAs
encoding the pheromone receptor polypeptides, or complements of such nucleic acid molecules.
More preferably, the unique fragment is at least 14, 15, 16, 17, 18, 20 or 22 contiguous
nucleotides of the nucleic acid sequence of the foregoing cDNAs encoding the pheromone
receptor polypeptides, or complements thereof. Particularly preferred isolated nucleic acid
molecules are isolated fragments of the foregoing cDNAs which encode one or more of the
following pheromone receptor polypeptide domains, alone or in combination (e.g., as fusion
proteins): an amino-terminal extracellular domain, a transmembrane region, and a
carboxy-terminal intracellular domain. In certain embodiments, the unique fragments are a pheromone
receptor extracellular domain or a pheromone receptor intracellular domain coupled to at least
one (e.g., 1, 2, 3, 4, 5, 6, or 7) transmembrane domain.
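
As a rough illustration of how candidate fragments of a given length might be screened, the Python sketch below lists the substrings of a target sequence that do not occur in any sequence of a comparison set. This is a working assumption made for the example only; the sketch does not state the specification's own definition of "unique fragment". The function name, the comparison set and the short example strings are hypothetical, and the default length of 12 simply mirrors the lower bound recited above.

```python
def unique_kmers(target, others, k=12):
    """Return the k-length substrings of `target` that occur in none of `others`.

    Substrings are reported once each, in order of first appearance in `target`.
    """
    background = set()
    for seq in others:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            background.add(seq[i:i + k])

    target = target.upper()
    seen = set()
    result = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in background and kmer not in seen:
            seen.add(kmer)
            result.append(kmer)
    return result
```

With k=4 and the hypothetical inputs "ACGTACGG" as target and ["ACGTAAAA"] as comparison set, the substrings "ACGT" and "CGTA" are shared and therefore excluded, leaving "GTAC", "TACG" and "ACGG".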

According to yet another aspect of the invention, an isolated nucleic acid molecule
comprising a molecule having a sequence selected from the group consisting of SEQ ID NO. 51,
53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, and 92, that encodes a pheromone receptor polypeptide is provided. This aspect of the
invention further embraces nucleic acid molecules that differ from these nucleic acid molecules
in codon sequence due to the degeneracy of the genetic code, and diversity among pheromone
receptors and complements of the foregoing.

According to still other aspects of the invention, an expression vector comprising any of
the foregoing isolated nucleic acid molecules operably linked to a promoter and host cells
transformed or transfected with the same also are provided. 

According to another aspect of the invention, an isolated polypeptide encoded by any of
the above-described isolated nucleic acid molecules is provided. Preferably, the isolated
polypeptide is a pheromone receptor polypeptide that has a pheromone receptor activity or an
antigenic fragment thereof. As used herein, a pheromone receptor activity refers to the ability
of the pheromone receptor to selectively bind to its cognate ligand (pheromone) and, optionally,
upon binding, to induce signal transduction in a cell that expresses the pheromone receptor. In
preferred embodiments, the isolated polypeptide comprises a pheromone receptor polypeptide
having a sequence selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.

According to yet other embodiments, the isolated polypeptide comprises a polypeptide
encoded by a nucleic acid which hybridizes under high or low stringency conditions to the
extracellular domain, transmembrane region and/or intracellular domain of a cDNA sequence
selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone receptor
polypeptide or fragment thereof. Thus, the invention embraces portions of a pheromone receptor
polypeptide that may include, for example, an amino-terminal extracellular domain or a
carboxy-terminal intracellular domain coupled to 1, 2, 3, 4, 5, 6, or 7 transmembrane domains.
Preferably, such polypeptides or fragments thereof are unique fragments and can function as, for
example, antigens for making antibodies specific for pheromone receptor family members.
Accordingly, the polypeptides of the invention can be used to isolate additional members of the
pheromone receptor family or, alternatively, can be used to induce in vivo an immune response
to a pheromone receptor, i.e., can be incorporated into a vaccine preparation. Such vaccine
compositions are useful for controlling fertility or behavior in an animal by administering to the
animal an effective amount of the vaccine to elicit an immune response to the pheromone
receptor. Thus, the invention embraces fragments or variants of the foregoing pheromone
receptors which exhibit certain detectable activities, e.g., a ligand binding activity, an
antigenicity activity. In certain embodiments, the isolated polypeptide is encoded by a cDNA
selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a pheromone
receptor polypeptide or one or more of its domains.

According to another aspect of the invention, there are provided isolated binding
polypeptides which selectively bind a unique amino acid sequence of a pheromone receptor
polypeptide or fragment thereof. The isolated binding polypeptide in certain embodiments binds
to a polypeptide comprising the extracellular domain and/or 1, 2, 3, 4, 5, 6, or 7 transmembrane
domains of a pheromone receptor polypeptide selected from the group consisting of SEQ ID
NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and
52.
The isolated polypeptide preferably binds to a polypeptide consisting of the
amino-terminal extracellular domain and/or one or more portions of the transmembrane region of a
pheromone receptor polypeptide sequence selected from the group consisting of SEQ ID NO.
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
In preferred embodiments, isolated binding polypeptides include antibodies and fragments of
antibodies (e.g., Fab, F(ab)2, Fd and antibody fragments which include a CDR3 region which
binds selectively to the unique sequences of the polypeptides of the invention). In the preferred
embodiments, the isolated binding peptides do not bind to pheromone receptors that are
expressed in vomeronasal organ neurons other than Gαo-protein-expressing neurons.

The invention provides in yet other aspects, isolated nucleic acids or polypeptides of the
invention that are: (a) immobilized to an insoluble support (an affinity matrix containing
immobilized pheromone receptor polypeptide or a unique fragment thereof); (b) associated with,
covalently coupled to, or encapsulated in a drug delivery device (e.g., a microsphere) to effect
controlled release of the isolated nucleic acid or polypeptide in vivo or in vitro; (c) covalently
coupled to another isolated nucleic acid or protein to form a chimeric molecule; and/or (d)
labeled with a detectable agent (e.g., a radiolabel, a fluorescent label). Thus, the invention
provides chimeric molecules containing at least one first structural domain of one pheromone
receptor polypeptide (e.g., an extracellular domain) coupled to a second structural domain (e.g.,
a transmembrane domain, such as TM1, TM2, etc.) of a different pheromone receptor
polypeptide. The invention also provides a method for isolating a pheromone receptor by (1)
contacting a composition containing a putative pheromone receptor of the above-described
family with an affinity matrix containing immobilized binding polypeptide under conditions to
permit the pheromone receptor to selectively bind to the immobilized binding polypeptide, and
(2) isolating the polypeptides that bind to the affinity matrix.

According to still another aspect of the invention, pharmaceutical compositions
containing any of the foregoing compounds of the invention in a pharmaceutically acceptable
carrier and methods of producing same by placing the compositions in the carrier also are 
provided. 

According to still another aspect of the invention, methods for modulating a pheromone
receptor activity (e.g., a ligand binding activity, a signal transduction activity) in a cell
(vertebrate or invertebrate) are provided. The cell can be located in vivo or in vitro and the
methods can be used to down regulate (inhibit) or up regulate (stimulate) the pheromone receptor
activity. For example, to inhibit a ligand binding activity, the cell is contacted with an inhibitor
that can be an isolated binding polypeptide that binds to an extracellular portion of the receptor
and, thereby, inhibits receptor binding to its cognate ligand. Such binding also can induce
conformational changes in the receptor that alter the signal transduction activity of the receptor.
The inhibitor can be an isolated antibody (or functional equivalent thereof) which binds to an
epitope located on an extracellular portion (such as EC2, EC3, EC4) of the pheromone receptor
polypeptide, e.g., an amino-terminal extracellular domain or an "extracellular transmembrane
region domain", i.e., an extracellular portion of the transmembrane region located between one
or more transmembrane domains. Alternatively, the inhibitor can be an agent (e.g., an isolated
competitive binding polypeptide) that inhibits receptor-ligand binding. For example, the
inhibitor can be an isolated fragment of a pheromone receptor (preferably, a soluble fragment),
which fragment contains a ligand (pheromone) binding site. Other inhibitors can be identified
in screening assays which test the ability of a putative inhibitor to inhibit pheromone
receptor-mediated signal transduction or which test the ability of the putative inhibitor to inhibit binding
of a pheromone receptor to its known cognate ligand. Similarly, such screening assays can be
used to identify molecules which stimulate pheromone receptor-mediated signal transduction.
Exemplary molecules which stimulate transduction include the naturally-occurring ligands (e.g.,
isolated from a biological source (e.g., urine, vaginal fluid)), as well as synthetic ligands obtained
from a non-biological source (e.g., a combinatorial library).

According to still another aspect of the invention, methods for inhibiting the binding of
a pheromone having a binding domain to a pheromone receptor polypeptide having a ligand
binding site that selectively binds to the binding domain are provided. The method involves
contacting (in vivo or in vitro) the pheromone receptor polypeptide with an agent which binds
to the ligand binding site under conditions to permit binding of the agent to the receptor. For
example, the agent can be an isolated binding polypeptide that binds to the ligand binding site
of the pheromone receptor. Thus, the agent can be an isolated antibody (or functionally
equivalent fragment thereof) which selectively binds to the ligand binding site of the receptor.
Alternatively, the agent can be a pheromone receptor antagonist, e.g., a molecule that mimics
the structure of the naturally-occurring ligand but that does not mimic the function (stimulating
the receptor) of the naturally-occurring ligand. Agents which inhibit ligand binding can be
identified in screening assays which test the ability of a putative binding inhibitor to inhibit
binding of a pheromone receptor to its cognate ligand (e.g., pheromone). Such molecules can be
isolated from a biological source or from a non-biological source.

According to another aspect of the invention, methods for modulating pheromone 
receptor-mediated signal transduction in a subject are provided. The methods involve 
administering to a subject in need of such treatment an agent that selectively binds to any of the
above-described isolated nucleic acid molecules which encode a pheromone receptor or unique 
fragment thereof, or an expression product thereof, in an amount effective to modulate (down 
regulate or up regulate) pheromone receptor-mediated signal transduction in the subject. 
Exemplary agents include antisense nucleic acid molecules and binding polypeptides. 

Thus, according to yet another aspect of the invention, methods are provided for
identifying lead compounds for a pharmacological agent useful in the diagnosis or treatment
of a condition associated with pheromone receptor signal transduction activity or otherwise
generally associated with binding of the receptor to its cognate ligand. Preferably, cells
expressing intact pheromone receptor polypeptides or portions thereof are used in the screening
assays for identifying lead compounds which modulate pheromone receptor-mediated ligand
binding or signal transduction activity. Cells expressing these polypeptides, isolated pheromone
receptor polypeptides and fragments of these polypeptides which contain the ligand binding site
can be used in the screening assays for identifying lead compounds which modulate binding of
the receptor to a known ligand.

The screening methods involve forming a mixture of a pheromone receptor polypeptide
(as noted above) or fragment thereof containing a ligand binding site; a molecule which is
known to (1) interact with the foregoing receptor to effect pheromone receptor-mediated signal
transduction or (2) bind to the ligand binding site of the receptor; and a candidate
pharmacological agent. The mixture is incubated under conditions which, in the absence of the
candidate pharmacological agent, permit a first amount of pheromone receptor-ligand binding
or receptor-mediated signal transduction by the known ligand. A test amount of the selective
binding of the ligand by the receptor or of the specific activation of signal transduction is
determined. Detection of an increase in the foregoing activities in the presence of the candidate
pharmacological agent indicates that the candidate pharmacological agent is a lead compound
for a pharmacological agent which increases specific activation of pheromone receptor-mediated
signal transduction or selective binding of the ligand by the ligand binding site of the receptor.
Detection of a decrease in the foregoing activities in the presence of the candidate
pharmacological agent indicates that the candidate pharmacological agent is a lead compound
for a pharmacological agent which decreases specific activation of pheromone receptor-mediated
signal transduction or selective binding of the ligand by the ligand binding site of the receptor.
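
The comparison at the heart of the screening method can be sketched in a few lines of Python. This is only an illustration: the function name, the relative `tolerance` used to decide whether a change counts as detectable, and the example readings are assumptions and are not part of the specification.

```python
def classify_candidate(first_amount, test_amount, tolerance=0.10):
    """Compare the test amount (candidate agent present) with the first
    amount (candidate agent absent).

    `tolerance` is a hypothetical relative cut-off below which a change is
    treated as undetectable; the specification does not state one.
    """
    if first_amount <= 0:
        raise ValueError("first_amount must be positive")
    relative_change = (test_amount - first_amount) / first_amount
    if relative_change > tolerance:
        return "lead compound: increases binding or signal transduction"
    if relative_change < -tolerance:
        return "lead compound: decreases binding or signal transduction"
    return "no detectable effect"


if __name__ == "__main__":
    # Made-up readings in arbitrary units, for illustration only.
    print(classify_candidate(100.0, 150.0))
    print(classify_candidate(100.0, 40.0))
    print(classify_candidate(100.0, 105.0))
```

In practice the cut-off would be set from replicate measurements of the assay actually used.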

Pheromone receptor polypeptides that are useful in the screening assays, preferably, are 
those selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 
26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Extracellular domains or portions 
thereof and portions of the transmembrane region, alone or coupled to one another, of these 
pheromone receptor polypeptides (indicated in the Examples) can be tested for their ability to 
inhibit receptor-ligand binding. 

These and other objects of the invention will be described in further detail in connection 
with the detailed description of the invention. 

All patents, patent publications, references and other information identified in this 
document are incorporated in their entirety herein by reference. 

Brief Description of the Drawings 

Figure 1 depicts a comparison of the deduced protein sequences encoded by VR 
cDNA clones. 

Figure 2 is a schematic comparison of ORs, VNRs, and VRs.
Figure 3 depicts a comparison of the deduced protein sequences encoded by the 
Go-VN cDNA clones. 

Brief Description of the Sequences 

SEQ ID NO. 1 is the nucleotide sequence of the mouse pheromone receptor VR1
cDNA (GenBank Accession No. AF011411).

SEQ ID NO. 2 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR1 cDNA (GenBank Accession No. AF011411).

SEQ ID NO. 3 is the nucleotide sequence of the mouse pheromone receptor VR2
cDNA (GenBank Accession No. AF011412).

SEQ ID NO. 4 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR2 cDNA (GenBank Accession No. AF011412).

SEQ ID NO. 5 is the nucleotide sequence of the mouse pheromone receptor VR3
cDNA (GenBank Accession No. AF011413).

SEQ ID NO. 6 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR3 cDNA (GenBank Accession No. AF011413).

SEQ ID NO. 7 is the nucleotide sequence of the mouse pheromone receptor VR4
cDNA (GenBank Accession No. AF011414).

SEQ ID NO. 8 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR4 cDNA (GenBank Accession No. AF011414).

SEQ ID NO. 9 is the nucleotide sequence of the mouse pheromone receptor VR5
cDNA (GenBank Accession No. AF011415).

SEQ ID NO. 10 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR5 cDNA (GenBank Accession No. AF011415).

SEQ ID NO. 11 is the nucleotide sequence of the mouse pheromone receptor VR6
cDNA (GenBank Accession No. AF011416).

SEQ ID NO. 12 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR6 cDNA (GenBank Accession No. AF011416).

SEQ ID NO. 13 is the nucleotide sequence of the mouse pheromone receptor VR7
cDNA (GenBank Accession No. AF011417).

SEQ ID NO. 14 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR7 cDNA (GenBank Accession No. AF011417).

SEQ ID NO. 15 is the nucleotide sequence of the mouse pheromone receptor VR8
cDNA (GenBank Accession No. AF011418).

SEQ ID NO. 16 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR8 cDNA (GenBank Accession No. AF011418).

SEQ ID NO. 17 is the nucleotide sequence of the mouse pheromone receptor VR9
cDNA (GenBank Accession No. AF011419).

SEQ ID NO. 18 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR9 cDNA (GenBank Accession No. AF011419).

SEQ ID NO. 19 is the nucleotide sequence of the mouse pheromone receptor VR10
cDNA (GenBank Accession No. AF011420).

SEQ ID NO. 20 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR10 cDNA (GenBank Accession No. AF011420).

SEQ ID NO. 21 is the nucleotide sequence of the mouse pheromone receptor VR11
cDNA (GenBank Accession No. AF011421).

SEQ ID NO. 22 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR11 cDNA (GenBank Accession No. AF011421).

SEQ ID NO. 23 is the nucleotide sequence of the mouse pheromone receptor VR12
cDNA (GenBank Accession No. AF011422).

SEQ ID NO. 24 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR12 cDNA (GenBank Accession No. AF011422).

SEQ ID NO. 25 is the nucleotide sequence of the mouse pheromone receptor VR13
cDNA (GenBank Accession No. AF011423).

SEQ ID NO. 26 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR13 cDNA (GenBank Accession No. AF011423).

SEQ ID NO. 27 is the nucleotide sequence of the mouse pheromone receptor VR14
cDNA (GenBank Accession No. AF011424).

SEQ ID NO. 28 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR14 cDNA (GenBank Accession No. AF011424).

SEQ ID NO. 29 is the nucleotide sequence of the mouse pheromone receptor VR15
cDNA (GenBank Accession No. AF011425).

SEQ ID NO. 30 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR15 cDNA (GenBank Accession No. AF011425).

SEQ ID NO. 31 is the nucleotide sequence of the mouse pheromone receptor VR16
cDNA (GenBank Accession No. AF011426).

SEQ ID NO. 32 is the predicted amino acid sequence of the polypeptide encoded by
the mouse pheromone receptor VR16 cDNA (GenBank Accession No. AF011426).

SEQ ID NO. 33 is the nucleotide sequence of the rat pheromone receptor Go-VN1
cDNA (GenBank Accession No. AF016178).

SEQ ID NO. 34 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN1 cDNA (GenBank Accession No. AF016178).

SEQ ID NO. 35 is the nucleotide sequence of the rat pheromone receptor Go-VN2
cDNA (GenBank Accession No. AF016179).

SEQ ID NO. 36 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN2 cDNA (GenBank Accession No. AF016179).

SEQ ID NO. 37 is the nucleotide sequence of the rat pheromone receptor Go-VN3
cDNA (GenBank Accession No. AF016180).

SEQ ID NO. 38 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN3 cDNA (GenBank Accession No. AF016180).

SEQ ID NO. 39 is the nucleotide sequence of the rat pheromone receptor Go-VN4
cDNA (GenBank Accession No. AF016181).

SEQ ID NO. 40 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN4 cDNA (GenBank Accession No. AF016181).

SEQ ID NO. 41 is the nucleotide sequence of the rat pheromone receptor Go-VN5
cDNA (GenBank Accession No. AF016182).

SEQ ID NO. 42 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN5 cDNA (GenBank Accession No. AF016182).

SEQ ID NO. 43 is the nucleotide sequence of the rat pheromone receptor Go-VN6
cDNA (GenBank Accession No. AF016183).

SEQ ID NO. 44 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN6 cDNA (GenBank Accession No. AF016183).

SEQ ID NO. 45 is the nucleotide sequence of the rat pheromone receptor Go-VN7
cDNA (GenBank Accession No. AF016184).

SEQ ID NO. 46 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN7 cDNA (GenBank Accession No. AF016184).

SEQ ID NO. 47 is the nucleotide sequence of the rat pheromone receptor Go-VN13C
cDNA (GenBank Accession No. AF016185).

SEQ ID NO. 48 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN13C cDNA (GenBank Accession No. AF016185).

SEQ ID NO. 49 is the nucleotide sequence of the rat pheromone receptor Go-VN13B
cDNA (GenBank Accession No. AF016186).

SEQ ID NO. 50 is the predicted amino acid sequence of the polypeptide encoded by
the rat pheromone receptor Go-VN13B cDNA (GenBank Accession No. AF016186).

SEQ ID NO. 51 is a partial nucleotide sequence of the human pheromone receptor
hVR1.

SEQ ID NO. 52 is the predicted amino acid sequence of the polypeptide encoded by
the partial sequence of the human pheromone receptor hVR1.

SEQ ID NO. 53 is a partial nucleotide sequence of the human pheromone receptor
hVNO1.

SEQ ID NO. 54 is a partial nucleotide sequence of the human pheromone receptor
hVNO2.

SEQ ID NO. 55 is a partial nucleotide sequence of the human pheromone receptor
hVNO3.

SEQ ID NO. 56 is the nucleotide sequence of primer AL1.

SEQ ID NO. 57 is the nucleotide sequence of primer AL3.

SEQ ID NO. 58 is a fifty amino acid sequence of Go-VN13B (SEQ ID NO. 50) that is
absent from Go-VN13C (SEQ ID NO. 48).

SEQ ID NO. 59 is the amino acid sequence of a rat kidney extracellular calcium/polyvalent
cation-sensing receptor.

SEQ ID NO. 60 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 61 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 62 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 63 is a degenerate oligonucleotide primer from a conserved VR domain.

SEQ ID NO. 64 is a degenerate oligonucleotide primer from a conserved VR domain.

SEQ ID NO. 65 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 66 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 67 is a degenerate oligonucleotide primer from a conserved VR domain. 

SEQ ID NO. 68 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR1.

SEQ ID NO. 69 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR2.

SEQ ID NO. 70 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR3.

SEQ ID NO. 71 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR4.

SEQ ID NO. 72 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR5.

SEQ ID NO. 73 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR6.

SEQ ID NO. 74 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR7.

SEQ ID NO. 75 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR8.

SEQ ID NO. 76 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR9.

SEQ ID NO. 77 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR10.

SEQ ID NO. 78 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR11.

SEQ ID NO. 79 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR12.

SEQ ID NO. 80 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR13.

SEQ ID NO. 81 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR14.

SEQ ID NO. 82 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR15.

SEQ ID NO. 83 is the nucleotide sequence of the coding region of the mouse
pheromone receptor VR16.

SEQ ID NO. 84 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN1.

SEQ ID NO. 85 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN2.

SEQ ID NO. 86 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN3.

SEQ ID NO. 87 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN4.

SEQ ID NO. 88 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN5.

SEQ ID NO. 89 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN6.

SEQ ID NO. 90 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN7.

SEQ ID NO. 91 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN13C.

SEQ ID NO. 92 is the nucleotide sequence of the coding region of the rat pheromone
receptor GoVN13B.

Detailed Description of the Invention

The present invention in one aspect involves the cloning of cDNAs encoding several
members of a multigene family of pheromone receptors. Complete cDNA sequences for
selected murine and rat pheromone receptors are provided. Partial sequences of the human gene
also are provided. The present invention also relates to the discovery that this family of
pheromone receptors is expressed in Gαo protein-expressing vomeronasal organ neurons ("Gαo+
VNO") or in another olfactory organ neuron in an animal (preferably, a mammal and more
preferably, a human) which lacks a vomeronasal organ. Throughout this description, the
pheromone receptors of the invention alternatively are referred to as "pheromone receptors",
"Gαo+ VNO pheromone receptors" or, simply, "Gαo+ VNO receptors".

Analysis of the sequence homology between members of the receptor family by
comparison to nucleic acid and protein databases established that the pheromone receptor family
has several domains. These include, from amino terminus to carboxyl terminus:
(a) an amino-terminal extracellular domain containing from 30 to 600 amino acids; (b) a
transmembrane region comprising: (i) seven non-contiguous transmembrane domains designated
TM1, TM2, TM3, TM4, TM5, TM6 and TM7, (ii) three non-contiguous extracellular domains
designated EC2, EC3 and EC4, and (iii) three non-contiguous intracellular domains designated
IC1, IC2, and IC3, wherein the transmembrane domains, the extracellular domains and the
intracellular domains are attached to one another from amino terminus to carboxyl terminus in
the order TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and wherein the
transmembrane region has at least about 35% homology and a length approximately equal to a
transmembrane region of a polypeptide selected from the group consisting of SEQ ID NO. 2,
4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and (c) a carboxyl-terminal intracellular
domain containing from 5 to 200 amino acids. Each polypeptide member of the family is
expressed in a Gαo protein-expressing vomeronasal organ neuron or is expressed in another
olfactory organ neuron in an animal which does not possess a vomeronasal organ. One skilled
in the art can readily identify olfactory organs in animals which do not possess a vomeronasal
organ. The homology can be calculated using various publicly available software tools
developed by NCBI (Bethesda, Maryland) that can be obtained through the internet
(ftp://ncbi.nlm.nih.gov/pub/). Exemplary tools include the BLAST system. Pairwise and
ClustalW alignments (BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis
can be obtained using the MacVector sequence analysis software (Oxford Molecular Group).
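
The domain organization recited above can be written down as a simple data structure and checked mechanically. The following Python sketch is illustrative only; the function name and the idea of supplying an annotated receptor as a list of domain labels are assumptions, while the domain order and the length ranges are taken from the paragraph above.

```python
# Order of the transmembrane-region domains, amino to carboxyl terminus,
# as recited in the text.
DOMAIN_ORDER = [
    "TM1", "IC1", "TM2", "EC2", "TM3", "IC2", "TM4",
    "EC3", "TM5", "IC3", "TM6", "EC4", "TM7",
]


def check_topology(n_terminal_length, domains, c_terminal_length):
    """Return a list of ways in which an annotated receptor departs from the
    recited organization (an empty list means no departure was found).

    Hypothetical helper: the annotation itself must come from elsewhere.
    """
    problems = []
    if not 30 <= n_terminal_length <= 600:
        problems.append("amino-terminal extracellular domain outside 30-600 residues")
    if list(domains) != DOMAIN_ORDER:
        problems.append("transmembrane region domains not in the recited order")
    if not 5 <= c_terminal_length <= 200:
        problems.append("carboxyl-terminal intracellular domain outside 5-200 residues")
    return problems


if __name__ == "__main__":
    print(check_topology(560, DOMAIN_ORDER, 50))        # []
    print(check_topology(12, DOMAIN_ORDER[::-1], 300))  # three problems
```

The sketch does not test the homology or length requirement on the transmembrane region, which needs an alignment against the reference sequences.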

The structure of the Gαo+ VNO pheromone receptors suggests that these receptors are
members of the large G protein-coupled receptor superfamily (GPCR). Like other GPCRs, the
Gαo+ VNO pheromone receptors exhibit seven hydrophobic stretches ("hydrophobic domains")
and are similar in structure to other types of GPCRs, namely the calcium sensing receptor (CSR;
SEQ ID NO. 59) and the metabotropic glutamate receptors (mGluRs). The CSR and mGluRs are unusual
among the GPCRs in that they have extremely long N-terminal extracellular domains (e.g.,
557-565 amino acids), a feature that is shared by the pheromone receptors of the invention. Despite
this similarity, the receptors of the invention do not share substantial primary structure homology
with the CSR and mGluRs. The receptors of the invention also are very different structurally
from two other G-protein coupled receptors, the odorant receptors and Gαi2+ vomeronasal
receptors, which share none of the characteristic sequence motifs of the receptors of the invention
and, moreover, which have very small (~12-28 amino acids) N-terminal extracellular domains.
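
Hydrophobic stretches of the kind mentioned above are commonly located with a sliding-window Kyte-Doolittle hydropathy profile. The Python sketch below illustrates the general technique and is not the MacVector procedure used for the sequences of the invention; the window of 19 residues and the cut-off of 1.6 are conventional choices assumed here, and the toy sequence in the example is invented.

```python
# Kyte-Doolittle hydropathy index for the twenty standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_stretches(sequence, window=19, threshold=1.6):
    """Return (start, end) residue ranges (0-based, end exclusive) covered by
    runs of consecutive windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(sequence, window)
    stretches = []
    start = None
    for i, value in enumerate(profile):
        if value >= threshold:
            if start is None:
                start = i
        elif start is not None:
            stretches.append((start, i - 1 + window))
            start = None
    if start is not None:
        stretches.append((start, len(profile) - 1 + window))
    return stretches


if __name__ == "__main__":
    # Invented toy sequence: one leucine-rich segment in a polar background.
    toy = "D" * 25 + "L" * 21 + "D" * 25
    print(hydrophobic_stretches(toy))
```

A receptor of the family described here would be expected to give seven such stretches, although the exact boundaries depend on the window and cut-off chosen.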

The receptors of the invention differ somewhat in amino acid sequence, with regions of
relatively high sequence homology. Refer to Examples 1 and 2 for a discussion and illustration
of the amino acid sequence homology for the murine and rat Gαo+ VNO receptors, respectively.
Other features of these members of the Gαo+ VNO receptor family also are discussed and
illustrated in the Examples. For example, signal sequences have been identified for several of
the Gαo+ VNO receptors disclosed in the Examples.

Homologs and alleles of the pheromone receptor nucleic acids of the invention can be
identified by conventional techniques. Thus, an aspect of the invention is those nucleic acid
sequences (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39,
41, 43, 45, 47, 49, 51, 53, 54, and 55) which code for Gαo+ VNO pheromone receptors and which
hybridize to a nucleic acid molecule consisting of the coding region of any one Gαo+ VNO
pheromone receptor selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, under high or low
stringency conditions. The term "high or low stringency conditions" as used herein refers to
parameters with which the art is familiar. Nucleic acid hybridization parameters may be found
in references which compile such methods, e.g. Molecular Cloning: A Laboratory Manual, J.
Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M. Ausubel, et al., eds.,
John Wiley & Sons, Inc., New York. More specifically, high stringency conditions, as used
herein, refers, for example, to hybridization at 65°C in hybridization buffer (3.5 x SSC, 0.02%
Ficoll, 0.02% polyvinyl pyrrolidone, 0.02% Bovine Serum Albumin, 2.5mM NaH2PO4 (pH7),
0.5% SDS, 2mM EDTA). SSC is 0.15M sodium chloride/0.15M sodium citrate, pH7; SDS is
sodium dodecyl sulphate; and EDTA is ethylenediaminetetraacetic acid. Low stringency
conditions would be the same, but with a lower temperature (e.g., 55°C). After hybridization,
the membrane to which the DNA is transferred is washed at 2 x SSC at room temperature and
then at 0.2 x SSC/0.5% SDS at temperatures of up to 65°C. Additional conditions of varying
stringency are provided in the Examples.
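
For bench planning it can help to hold the hybridization and wash parameters stated above in one place and to convert the SSC multiples into salt concentrations. The Python sketch below does only that; the dictionary layout and names are assumptions, the 55°C value is the example temperature given above for low stringency, and only the sodium chloride figure stated in the text for SSC is used.

```python
# Sodium chloride concentration of 1 x SSC as stated in the text (mol/L).
NACL_MOLAR_PER_1X_SSC = 0.15

# Hybridization conditions as described above; the layout is illustrative.
HYBRIDIZATION = {
    "high": {"temperature_c": 65, "ssc_fold": 3.5},
    "low": {"temperature_c": 55, "ssc_fold": 3.5},
}

# Post-hybridization washes, in the order given in the text.
WASHES = [
    {"ssc_fold": 2.0, "sds_percent": 0.0, "temperature": "room temperature"},
    {"ssc_fold": 0.2, "sds_percent": 0.5, "temperature": "up to 65 C"},
]


def nacl_molarity(ssc_fold):
    """Sodium chloride molarity of a buffer made at the given multiple of SSC."""
    return round(ssc_fold * NACL_MOLAR_PER_1X_SSC, 4)


if __name__ == "__main__":
    for name, params in HYBRIDIZATION.items():
        print(name, params["temperature_c"], nacl_molarity(params["ssc_fold"]))
    for wash in WASHES:
        print(wash["temperature"], nacl_molarity(wash["ssc_fold"]))
```

The lower salt concentration of the final wash, rather than the temperature alone, is what makes that step the more stringent of the two.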

There are other conditions, reagents, and so forth which can be used, which result in a
similar degree of stringency. The skilled artisan will be familiar with such conditions, and thus
they are not given here. It will be understood, however, that the skilled artisan will be able to
manipulate the conditions in a manner to permit the clear identification of homologs and alleles
of the Gαo+ VNO pheromone receptor nucleic acids of the invention. The skilled artisan also is
familiar with the methodology for screening cells and libraries for expression of such molecules
which then are routinely isolated, followed by isolation of the pertinent nucleic acid molecule
and sequencing.

In general, homologs and alleles typically will share at least 35% nucleotide identity
and/or at least 50% amino acid identity to the cDNAs encoding a Gαo+ VNO pheromone receptor
polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20,
22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, in some instances will share at
least 50% nucleotide identity and/or at least 65% amino acid identity and in still other instances
will share at least 60% nucleotide identity and/or at least 75% amino acid identity. Watson-Crick
complements of the foregoing nucleic acids also are embraced by the invention. As discussed
above in the Summary of the invention, certain domains within the pheromone receptors may
share even greater sequence homology to a pheromone receptor polypeptide selected from the
group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36,
38, 40, 42, 44, 46, 48, 50 and 52.

In screening for Gαo+ VNO pheromone receptor polypeptides, a Southern blot may be
performed using the foregoing conditions, together with a radioactive probe. After washing the
membrane to which the DNA is finally transferred, the membrane can be placed against X-ray
film to detect the radioactive signal.
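
The identity thresholds given above for homologs and alleles can be applied to a pair of already aligned sequences as sketched below in Python. Several points are assumptions made for illustration: percent identity is computed over aligned positions in which neither sequence has a gap (other definitions exist), "and/or" is read as satisfying either threshold, and the short sequences in the example are invented.

```python
# (nucleotide %, amino acid %) tiers from the text, most demanding first.
IDENTITY_TIERS = [(60, 75), (50, 65), (35, 50)]


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns of a pairwise alignment in which
    neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def homolog_tier(nucleotide_identity, amino_acid_identity):
    """Return the most demanding tier met, or None if none is met.
    Either the nucleotide or the amino acid threshold suffices here."""
    for nt_min, aa_min in IDENTITY_TIERS:
        if nucleotide_identity >= nt_min or amino_acid_identity >= aa_min:
            return (nt_min, aa_min)
    return None


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACCTTA"))
    print(homolog_tier(40.0, 20.0))
```

The alignment itself would come from a tool such as BLAST or ClustalW, as noted earlier in the description.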

The invention also includes degenerate nucleic acids which include alternative codons
to those present in the native materials. For example, serine residues are encoded by the codons
TCA, AGT, TCC, TCG, TCT and AGC. Each of the six codons is equivalent for the purposes
of encoding a serine residue. Thus, it will be apparent to one of ordinary skill in the art that any
of the serine-encoding nucleotide triplets may be employed to direct the protein synthesis
apparatus, in vitro or in vivo, to incorporate a serine residue into an elongating Gαo+ VNO
pheromone receptor polypeptide. Similarly, nucleotide sequence triplets which encode other
amino acid residues include, but are not limited to: CCA, CCC, CCG and CCT (proline
codons); CGA, CGC, CGG, CGT, AGA and AGG (arginine codons); ACA, ACC, ACG and
ACT (threonine codons); AAC and AAT (asparagine codons); and ATA, ATC and ATT
(isoleucine codons). Other amino acid residues may be encoded similarly by multiple nucleotide
sequences. Thus, the invention embraces degenerate nucleic acids that differ from the
biologically isolated nucleic acids in codon sequence due to the degeneracy of the genetic code.
In addition, areas of high similarity among pheromone receptors may differ in amino acid
sequences such that they share many, but not all, amino acids. Their nucleotide sequences all
differ accordingly.
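
The degeneracy described above can be made concrete by enumerating every nucleotide sequence that encodes a short peptide. The Python sketch below uses only the codon sets listed in the preceding paragraph; the function name and the example peptides are assumptions chosen for illustration.

```python
from itertools import product

# Codon sets exactly as listed in the text (DNA alphabet), keyed by
# one-letter amino acid code.
CODONS = {
    "S": ["TCA", "AGT", "TCC", "TCG", "TCT", "AGC"],  # serine
    "P": ["CCA", "CCC", "CCG", "CCT"],                # proline
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine
    "T": ["ACA", "ACC", "ACG", "ACT"],                # threonine
    "N": ["AAC", "AAT"],                              # asparagine
    "I": ["ATA", "ATC", "ATT"],                       # isoleucine
}


def degenerate_sequences(peptide):
    """All nucleotide sequences encoding `peptide`, which may contain only
    the residues covered by CODONS above."""
    codon_choices = [CODONS[residue] for residue in peptide.upper()]
    return ["".join(codons) for codons in product(*codon_choices)]


if __name__ == "__main__":
    variants = degenerate_sequences("SN")
    print(len(variants))   # 6 serine codons x 2 asparagine codons
    print(variants[:3])
```

The number of variants grows multiplicatively with peptide length, so full enumeration is practical only for short stretches such as primer sites.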

The invention also provides isolated unique fragments of the cDNAs encoding a Gαo+
VNO polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, or complements of these
sequences. A unique fragment is one that is a 'signature' for the larger nucleic acid. It is, for
example, long enough to assure that its precise sequence is not found in molecules outside of
the Gαo+ VNO pheromone receptor nucleic acids defined above. Unique fragments can be used
as probes in Southern blot assays to identify such nucleic acids, or can be used as primers in
amplification assays such as those employing PCR. As known to those skilled in the art, large
probes such as 200 nucleotides or more are preferred for certain uses such as Southern blots,
while smaller fragments will be preferred for uses such as PCR. Unique fragments also can be
used to produce fusion proteins for generating antibodies or determining binding of the
polypeptide fragments, as demonstrated in the Examples, or for generating immunoassay
components. Likewise, unique fragments can be employed to produce nonfused fragments of
the Gαo+ VNO pheromone receptor polypeptides, useful, for example, in the preparation of
antibodies, in immunoassays, and as a competitive binding partner of the pheromones and/or
other ligands which bind to the Gαo+ VNO pheromone receptor polypeptides, for example, in
therapeutic applications. Unique fragments further can be used as antisense molecules to inhibit
the expression of Gαo+ VNO pheromone receptor nucleic acids and polypeptides, particularly for
insecticide and other fertility control purposes as described in greater detail below.

As will be recognized by those skilled in the art, the size of the unique fragment will
depend upon its conservancy in the genetic code. Thus, some regions of a cDNA selected from
the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a Gαo+ VNO polypeptide, and
its complement will require longer segments to be unique while others will require only short
segments, typically between 12 and 32 nucleotides (e.g. 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31 and 32 bases long). Virtually any segment of the region of the
cDNAs encoding the full length Gαo+ VNO polypeptide or their complements, that is 18 or more
nucleotides in length will be unique. Those skilled in the art are well versed in methods for
selecting such sequences, typically on the basis of the ability of the unique fragment to
selectively distinguish the sequence of interest from non-Gαo+ VNO pheromone receptor nucleic
acids. A comparison of the sequence of the fragment to those on known databases typically is
all that is necessary, although in vitro confirmatory hybridization and sequencing analysis may
be performed.
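
The database comparison described above amounts to asking whether a candidate fragment, or its complement, occurs anywhere in a set of unrelated sequences. The Python sketch below shows the idea with exact string matching on invented toy sequences; the function names are assumptions, and a real search would use a tool such as BLAST against public databases rather than an in-memory list.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def is_unique_fragment(fragment, background):
    """True if neither the fragment nor its reverse complement occurs as an
    exact substring of any background sequence."""
    forward = fragment.upper()
    reverse = reverse_complement(forward)
    return not any(
        forward in other.upper() or reverse in other.upper()
        for other in background
    )


def find_unique_fragments(target, background, length=18):
    """Start positions (0-based) of every window of `length` bases in
    `target` that is unique with respect to `background`."""
    return [
        start
        for start in range(len(target) - length + 1)
        if is_unique_fragment(target[start:start + length], background)
    ]


if __name__ == "__main__":
    # Invented sequences, with a short window so the example stays readable.
    target = "ACGTACGTTTGACCA"
    background = ["ACGTACGT"]
    print(find_unique_fragments(target, background, length=8))
```

Exact matching is the strictest possible test; a probe that must also avoid near matches would need a mismatch-tolerant comparison.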

As mentioned above, the invention embraces antisense oligonucleotides that selectively 
bind to a nucleic acid molecule encoding a Gcto + VNO pheromone receptor polypeptide, to 
decrease a pheromone receptor activity (e.g., a ligand binding activity, a signal transduction 

25 activity). This is desirable in virtually any condition wherein a reduction in pheromone binding 
or induction of a behavior that is triggered by pheromone binding is desirable, including to 
control fertility and behavior in vertebrates and invertebrates. The compositions of the invention 
are particularly useful in, for example, controlling fertility in livestock and controlling 
reproduction in rodents or insects by interrupting the normal behaviors of rodents or insects that 

30 result in reproduction. As used herein, the term "antisense oligonucleotide" or "antisense" 
describes an oligonucleotide that is an oligoribonucleotide, oligodeoxyribonucleotide, modified 
oligoribonucleotide, or modified oligodeoxyribonucleotide which hybridizes under physiological 
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conditions to DNA comprising a particular gene or to an mRNA transcript of that gene and, 
thereby, inhibits the transcription of that gene and/or the translation of that mRNA. The 
antisense molecules are designed so as to interfere with transcription or translation of a target 
gene upon hybridization with the target gene or transcript. Those skilled in the art will recognize 
5 that the exact length of the antisense oligonucleotide and its degree of complementarity with its 
target will depend upon the specific target selected, including the sequence of the target and the 
particular bases which comprise that sequence. It is preferred that the antisense oligonucleotide 
be constructed and arranged so as to bind selectively with the target under physiological 
conditions, i.e., to hybridize substantially more to the target sequence than to any other sequence
in the target cell under physiological conditions. Based upon the cDNA sequences of Examples
1 and 2 (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41,
43, 45, 47, 49, 51, 53, 54, and 55), or upon allelic or homologous genomic and/or cDNA
sequences, one of skill in the art can easily choose and synthesize any of a number of appropriate
antisense molecules for use in accordance with the present invention. In order to be sufficiently
selective and potent for inhibition, such antisense oligonucleotides should comprise at least 10
and, more preferably, at least 15 consecutive bases which are complementary to the target, 
although in certain cases modified oligonucleotides as short as 7 bases in length have been used 
successfully as antisense oligonucleotides (Wagner et al., Nature Biotechnol 14:840-844, 1996). 
Most preferably, the antisense oligonucleotides comprise a complementary sequence of 20-30
bases. Although oligonucleotides may be chosen which are antisense to any region of the gene
or mRNA transcripts, in preferred embodiments the antisense oligonucleotides correspond to
N-terminal or 5' upstream sites such as translation initiation, transcription initiation or promoter
sites. In addition, 3'-untranslated regions may be targeted. Targeting to mRNA splicing sites has
also been used in the art but may be less preferred if alternative mRNA splicing occurs. In
addition, the antisense is targeted, preferably, to sites in which mRNA secondary structure is not
expected (see, e.g., Sainio et al., Cell Mol. Neurobiol. 14(5):439-457, 1994) and at which
proteins are not expected to bind. Finally, although Examples 1 and 2 disclose cDNA sequences
(SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45,
47, 49, 51, 53, 54, and 55), one of ordinary skill in the art may easily derive the genomic DNA
corresponding to these cDNAs. Thus, the present invention also provides for
antisense oligonucleotides which are complementary to the genomic DNA corresponding to a
cDNA sequence selected from the group consisting of SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55. Similarly,
antisense oligonucleotides to allelic or homologous cDNAs and genomic DNAs are enabled
without undue experimentation.
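
By way of illustration only, the selection of an antisense sequence against a chosen window of a sense-strand cDNA can be sketched as follows. This is a minimal sketch, not a method disclosed in the Examples: the function names, the toy sequence and the default minimum length of 10 bases (taken from the length guidance above) are assumptions made for the illustration, and the sketch does not evaluate secondary structure, protein binding sites or selectivity against other sequences in the target cell.

```python
# Illustrative sketch: derive an antisense oligodeoxynucleotide for a window
# of a sense-strand cDNA. All names and the toy sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    if not set(seq) <= set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense_cdna: str, start: int, length: int = 20,
                    min_length: int = 10) -> str:
    """Return an oligo complementary to sense_cdna[start:start + length].

    The oligo is written 5'->3', so it is the reverse complement of the
    selected sense-strand window.
    """
    if length < min_length:
        raise ValueError(f"length must be at least {min_length} bases")
    window = sense_cdna[start:start + length]
    if start < 0 or len(window) < length:
        raise ValueError("window extends beyond the end of the sequence")
    return reverse_complement(window)


if __name__ == "__main__":
    # Toy sense strand; the ATG at index 6 stands in for a translation
    # initiation site. It is not one of the disclosed SEQ ID NOs.
    toy_cdna = "GGCACCATGGCTAGCTTGACCTGAAGGTT"
    print(antisense_oligo(toy_cdna, start=6, length=20))
```

In practice the window would be chosen at a translation initiation, transcription initiation or promoter site as discussed above, and candidate oligonucleotides would still need to be compared against other sequences expected in the target cell.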

In one set of embodiments, the antisense oligonucleotides of the invention may be
composed of "natural" deoxyribonucleotides, ribonucleotides, or any combination thereof. That
is, the 5' end of one native nucleotide and the 3' end of another native nucleotide may be
covalently linked, as in natural systems, via a phosphodiester internucleoside linkage. These
oligonucleotides may be prepared by art recognized methods which may be carried out manually 
or by an automated synthesizer. They also may be produced recombinantly by vectors. 

In preferred embodiments, however, the antisense oligonucleotides of the invention also
may include "modified" oligonucleotides. That is, the oligonucleotides may be modified in a
number of ways which do not prevent them from hybridizing to their target but which enhance 
their stability or targeting or which otherwise enhance their therapeutic effectiveness. 

The term "modified oligonucleotide" as used herein describes an oligonucleotide in
which (1) at least two of its nucleotides are covalently linked via a synthetic internucleoside
linkage (i.e., a linkage other than a phosphodiester linkage between the 5' end of one nucleotide
and the 3' end of another nucleotide) and/or (2) a chemical group not normally associated with
nucleic acids has been covalently attached to the oligonucleotide. Preferred synthetic
internucleoside linkages are phosphorothioates, alkylphosphonates, phosphorodithioates,
phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates, carbonates, phosphate
triesters, acetamidates, carboxymethyl esters and peptides.

The term "modified oligonucleotide" also encompasses oligonucleotides with a
covalently modified base and/or sugar. For example, modified oligonucleotides include
oligonucleotides having backbone sugars which are covalently attached to low molecular weight
organic groups other than a hydroxyl group at the 3' position and other than a phosphate group
at the 5' position. Thus modified oligonucleotides may include a 2'-O-alkylated ribose group.
In addition, modified oligonucleotides may include sugars such as arabinose instead of ribose.
The present invention, thus, contemplates pharmaceutical preparations containing modified
antisense molecules that are complementary to and hybridizable with, under physiological
conditions, nucleic acids encoding pheromone receptor polypeptides, together with
pharmaceutically acceptable carriers.

Antisense oligonucleotides may be administered as part of a pharmaceutical composition.
Such a pharmaceutical composition may include the antisense oligonucleotides in combination
with any standard physiologically and/or pharmaceutically acceptable carriers which are known
in the art. The compositions should be sterile and contain a therapeutically effective amount of
the antisense oligonucleotides in a unit of weight or volume suitable for administration to a
patient. The term "pharmaceutically acceptable" means a non-toxic material that does not
interfere with the effectiveness of the biological activity of the active ingredients. The term
"physiologically acceptable" refers to a non-toxic material that is compatible with a biological
system such as a cell, cell culture, tissue, or organism. The characteristics of the carrier will
depend on the route of administration. Physiologically and pharmaceutically acceptable carriers
include diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials which are well
known in the art.

As used herein, a "vector" may be any of a number of nucleic acids into which a desired
sequence may be inserted by restriction and ligation for transport between different genetic
environments or for expression in a host cell. Vectors are typically composed of DNA although
RNA vectors are also available. Vectors include, but are not limited to, plasmids, phagemids and
virus genomes. A cloning vector is one which is able to replicate in a host cell, and which is
further characterized by one or more endonuclease restriction sites at which the vector may be
cut in a determinable fashion and into which a desired DNA sequence may be ligated such that
the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids,
replication of the desired sequence may occur many times as the plasmid increases in copy
number within the host bacterium or just a single time per host before the host reproduces by
mitosis. In the case of phage, replication may occur actively during a lytic phase or passively
during a lysogenic phase. An expression vector is one into which a desired DNA sequence may
be inserted by restriction and ligation such that it is operably joined to regulatory sequences and
may be expressed as an RNA transcript. Vectors may further contain one or more marker
sequences suitable for use in the identification of cells which have or have not been transformed
or transfected with the vector. Markers include, for example, genes encoding proteins which
increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes
which encode enzymes whose activities are detectable by standard assays known in the art (e.g.,
β-galactosidase or alkaline phosphatase), and genes which visibly affect the phenotype of
transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein).
Preferred vectors are those capable of autonomous replication and expression of the structural
gene products present in the DNA segments to which they are operably joined.

As used herein, a coding sequence and regulatory sequences are said to be "operably"
joined when they are covalently linked in such a way as to place the expression or transcription
of the coding sequence under the influence or control of the regulatory sequences. If it is desired
that the coding sequences be translated into a functional protein, two DNA sequences are said
to be operably joined if induction of a promoter in the 5' regulatory sequences results in the
transcription of the coding sequence and if the nature of the linkage between the two DNA
sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the
ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere
with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a
promoter region would be operably joined to a coding sequence if the promoter region were
capable of effecting transcription of that DNA sequence such that the resulting transcript might
be translated into the desired protein or polypeptide.

The precise nature of the regulatory sequences needed for gene expression may vary
between species or cell types, but shall in general include, as necessary, 5' non-transcribed and
5' non-translated sequences involved with the initiation of transcription and translation
respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. Especially,
such 5' non-transcribed regulatory sequences will include a promoter region which includes a
promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences
may also include enhancer sequences or upstream activator sequences as desired. The vectors
of the invention may optionally include 5' leader or signal sequences. The choice and design of
an appropriate vector is within the ability and discretion of one of ordinary skill in the art.

Expression vectors containing all the necessary elements for expression are commercially
available and known to those skilled in the art. See, e.g., Sambrook et al., Molecular Cloning:
A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are
genetically engineered by the introduction into the cells of heterologous DNA (RNA) encoding
a pheromone receptor polypeptide or fragment or variant thereof. That heterologous DNA (RNA)
is placed under operable control of transcriptional elements to permit the expression of the
heterologous DNA in the host cell.

Preferred systems for mRNA expression in mammalian cells are those such as pRc/CMV
(available from Invitrogen, Carlsbad, CA) that contain a selectable marker such as a gene that
confers G418 resistance (which facilitates the selection of stably transfected cell lines) and the
human cytomegalovirus (CMV) enhancer-promoter sequences. Additionally, suitable for
expression in primate or canine cell lines is the pCEP4 vector (Invitrogen), which contains an
Epstein Barr virus (EBV) origin of replication, facilitating the maintenance of plasmid as a
multicopy extrachromosomal element. Another expression vector is the pEF-BOS plasmid
containing the promoter of polypeptide Elongation Factor 1α, which efficiently stimulates
transcription in vitro. The plasmid is described by Mizushima and Nagata (Nuc. Acids Res.
18:5322, 1990), and its use in transfection experiments is disclosed by, for example, Demoulin
(Mol. Cell. Biol. 16:4710-4716, 1996). Still another preferred expression vector is an adenovirus,
described by Stratford-Perricaudet, which is defective for E1 and E3 proteins (J. Clin. Invest.
90:626-630, 1992). The use of the adenovirus as an Adeno.P1A recombinant is disclosed by
Warnier et al., in intradermal injection in mice for immunization against P1A (Int. J. Cancer,
67:303-310, 1996).

The invention also embraces so-called expression kits, which allow the artisan to prepare
a desired expression vector or vectors. Such expression kits include at least separate portions of
each of the previously discussed coding sequences. Other components may be added, as desired, 
as long as the previously mentioned sequences, which are required, are included. 

The invention also permits the construction of pheromone receptor gene "knock-outs" 
in cells and in animals, providing materials for studying certain aspects of pheromone receptor 
binding, signal transduction activity, or function.

The invention also provides isolated polypeptides, which include a pheromone receptor 
polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 
22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52 and unique fragments of these 
pheromone receptor polypeptides. Such polypeptides are useful, for example, alone or as fusion 
proteins to generate antibodies.

A unique fragment of a pheromone receptor polypeptide, in general, has the features and 
characteristics of unique fragments as discussed above in connection with nucleic acids. As will 
be recognized by those skilled in the art, the size of the unique fragment will depend upon factors 
such as whether the fragment constitutes a portion of a conserved protein domain. Thus, some 
regions of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO.
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52
will require longer segments to be unique while others will require only short segments, typically
between 5 and 12 amino acids (e.g., 5, 6, 7, 8, 9, 10, 11 and 12 amino acids long).

Unique fragments of a polypeptide preferably are those fragments which retain a distinct
functional capability of the polypeptide. Functional capabilities which can be retained in a
unique fragment of a polypeptide include interaction with antibodies, interaction with other
polypeptides (e.g., G-proteins) or molecules (e.g., a ligand) or fragments thereof, selective binding
of nucleic acids or proteins, and enzymatic activity. Those skilled in the art are well versed in
methods for selecting unique amino acid sequences, typically on the basis of the ability of the
unique fragment to selectively distinguish the sequence of interest from non-family members.
A comparison of the sequence of the fragment to those in known databases typically is all that
is necessary.
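
The database comparison described above can be pictured with a small sketch. It is offered only as an illustration under stated assumptions: the function name and the toy sequences are hypothetical, "uniqueness" is reduced to the absence of an exact substring match in a supplied list of comparison sequences, and a real comparison would be run against public sequence databases with standard search tools.

```python
# Illustrative sketch: list fragments of a fixed length that occur in a
# target polypeptide but in none of a set of comparison sequences.
# Names and sequences are hypothetical.

from typing import List, Sequence, Tuple


def unique_fragments(target: str, others: Sequence[str],
                     k: int) -> List[Tuple[int, str]]:
    """Return (position, fragment) pairs for each k-residue window of
    `target` that is not an exact substring of any sequence in `others`."""
    if k < 1 or k > len(target):
        raise ValueError("k must be between 1 and the target length")
    hits = []
    for i in range(len(target) - k + 1):
        fragment = target[i:i + k]
        if not any(fragment in other for other in others):
            hits.append((i, fragment))
    return hits


if __name__ == "__main__":
    # Toy sequences in one-letter code; not taken from the SEQ ID NOs.
    target = "MKTLLVAGS"
    comparison_set = ["MKTLLQQQQ"]
    for k in range(5, 8):
        print(k, unique_fragments(target, comparison_set, k))
```

Scanning k from 5 to 12 mirrors the typical short-segment lengths given above; regions that fall within a conserved domain would be expected to need larger values of k before any window passes the test.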

The invention embraces variants of the pheromone receptor polypeptides described
above. As used herein, a "variant" of a pheromone receptor polypeptide is a polypeptide which
contains one or more modifications to the primary amino acid sequence of a pheromone receptor
polypeptide. Modifications which create a pheromone receptor variant can be made to a
pheromone receptor polypeptide 1) to reduce or eliminate an activity of a pheromone receptor
polypeptide, such as a ligand binding activity or a signal transduction activity; 2) to enhance a
property of a pheromone receptor polypeptide, such as protein stability in an expression system
or the stability of protein-protein binding; or 3) to provide a novel activity or property to a
pheromone receptor polypeptide, such as addition of an antigenic epitope or addition of a
detectable moiety. Modifications to a pheromone receptor polypeptide are typically made to the
nucleic acid which encodes the pheromone receptor polypeptide, and can include deletions, point
mutations, truncations, amino acid substitutions and additions of amino acids or non-amino acid
moieties. Alternatively, modifications can be made directly to the polypeptide, such as by
cleavage, addition of a linker molecule, addition of a detectable moiety, such as biotin, addition
of a fatty acid, and the like. Modifications also embrace fusion proteins comprising all or part
of the pheromone receptor amino acid sequence.

In general, variants include pheromone receptor polypeptides which are modified
specifically to alter a feature of the polypeptide unrelated to its physiological activity. For
example, cysteine residues can be substituted or deleted to prevent unwanted disulfide linkages.
Similarly, certain amino acids can be changed to enhance expression of a pheromone receptor
polypeptide by eliminating proteolysis by proteases in an expression system.

Mutations of a nucleic acid which encodes a pheromone receptor polypeptide preferably
preserve the amino acid reading frame of the coding sequence, and preferably do not create
regions in the nucleic acid which are likely to hybridize to form secondary structures, such as
hairpins or loops, which can be deleterious to expression of the variant polypeptide.

Mutations can be made by selecting an amino acid substitution, or by random
mutagenesis of a selected site in a nucleic acid which encodes the polypeptide. Variant
polypeptides are then expressed and tested for one or more activities to determine which
mutation provides a variant polypeptide with the desired properties. Further mutations can be
made to variants (or to non-variant pheromone receptor polypeptides) which are silent as to the
amino acid sequence of the polypeptide, but which provide preferred codons for translation in
a particular host. The preferred codons for translation of a nucleic acid in, e.g., E. coli, are well
known to those of ordinary skill in the art. Still other mutations can be made to the noncoding
sequences of a pheromone receptor gene or cDNA clone to enhance expression of the
polypeptide. The activity of variants of pheromone receptor polypeptides can be tested by
cloning the gene encoding the variant pheromone receptor polypeptide into a bacterial or
mammalian expression vector, introducing the vector into an appropriate host cell, expressing
the variant pheromone receptor polypeptide, and testing for a functional capability of the
pheromone receptor polypeptides as disclosed herein. For example, the variant pheromone
receptor polypeptide can be tested for a ligand binding activity, wherein a ligand to which the
receptor binds is contacted with the variant receptor and the amount of ligand binding to the
variant receptor is determined using conventional procedures to measure the binding of one
molecule to another. Preparation of other variant polypeptides may favor testing of other
activities, as will be known to one of ordinary skill in the art.
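
As an illustration of what it means for a mutation to be silent as to the amino acid sequence while preserving the reading frame, the following minimal sketch translates an original and a mutated coding sequence with the standard genetic code and compares the results. The function names and the toy sequences are assumptions made for this illustration; the sketch does not itself choose preferred codons for any host, and the codons in the example are arbitrary synonymous choices.

```python
# Illustrative sketch: test whether a mutated coding sequence is "silent",
# i.e. same length (reading frame preserved) and same encoded amino acids.
# Names and sequences are hypothetical.

from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding DNA sequence; '*' marks a stop codon."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3 != 0:
        raise ValueError("length is not a multiple of three (frame broken)")
    return "".join(CODON_TABLE[coding_dna[i:i + 3]]
                   for i in range(0, len(coding_dna), 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if `mutated` keeps the length and encodes the same residues."""
    return (len(original) == len(mutated)
            and translate(original) == translate(mutated))


if __name__ == "__main__":
    original = "ATGCTT"           # Met-Leu
    synonymous = "ATGCTG"         # Met-Leu, different leucine codon
    missense = "ATGCCT"           # Met-Pro
    print(is_silent(original, synonymous), is_silent(original, missense))
```

A check of this kind could be applied after substituting codons from a host-specific preference table, which is deliberately left out of the sketch.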

The skilled artisan will also realize that conservative amino acid substitutions may be
made in pheromone receptor polypeptides to provide functionally equivalent variants of the
foregoing polypeptides, i.e., the variants retain the functional capabilities of the pheromone
receptor polypeptides. As used herein, a "conservative amino acid substitution" refers to an
amino acid substitution which does not alter the relative charge or size characteristics of the
protein in which the amino acid substitution is made. Variants can be prepared according to
methods for altering polypeptide sequence known to one of ordinary skill in the art such as are
found in references which compile such methods, e.g., Molecular Cloning: A Laboratory Manual,
J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M. Ausubel, et al., eds.,
John Wiley & Sons, Inc., New York. To a certain extent, the various members of the pheromone
receptor family that are illustrated in the Examples represent exemplary functionally equivalent
variants of the pheromone receptor polypeptides. Other functionally equivalent variants include
conservative amino acid substitutions of the amino acids of a pheromone receptor polypeptide
selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26,
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Conservative substitutions of amino acids
include substitutions made amongst amino acids within the following groups: (a) M, I, L, V; (b)
F, Y, W; (c) K, R, H; (d) A, G; (e) S, T; (f) Q, N; and (g) E, D.
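
The groups (a) through (g) above lend themselves to a simple lookup. The following minimal sketch, with hypothetical function names and toy sequences, classifies a single substitution as conservative when both residues fall within the same group, and lists the non-conservative differences between two equal-length sequences. It says nothing about whether a given variant actually retains function, which is determined by testing as described herein.

```python
# Illustrative sketch: classify amino acid substitutions using the
# conservative groups (a)-(g) listed in the text. Names are hypothetical.

from typing import List, Tuple

CONSERVATIVE_GROUPS = ("MILV", "FYW", "KRH", "AG", "ST", "QN", "ED")


def is_conservative(original: str, substitute: str) -> bool:
    """True if two different residues belong to the same listed group."""
    if original == substitute:
        return False  # identical residues are not a substitution
    return any(original in group and substitute in group
               for group in CONSERVATIVE_GROUPS)


def non_conservative_changes(reference: str,
                             variant: str) -> List[Tuple[int, str, str]]:
    """Return (position, reference residue, variant residue) for each
    difference that is not conservative under the groups above."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [(i, r, v)
            for i, (r, v) in enumerate(zip(reference, variant))
            if r != v and not is_conservative(r, v)]


if __name__ == "__main__":
    # Toy peptides in one-letter code; not taken from the SEQ ID NOs.
    print(is_conservative("L", "V"))
    print(non_conservative_changes("MKDLS", "MRELP"))
```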

Conservative amino acid substitutions in the amino acid sequence of pheromone receptor
polypeptides to produce functionally equivalent variants of pheromone receptor polypeptides
typically are made by alteration of the nucleic acid encoding pheromone receptor polypeptides.
Such substitutions can be made by a variety of methods known to one of ordinary skill in the art.
For example, amino acid substitutions may be made by PCR-directed mutation, site-directed
mutagenesis according to the method described in Proc. Nat. Acad. Sci. U.S.A. 82: 488-492,
1985, or by chemical synthesis of a gene encoding a pheromone receptor polypeptide. Where
amino acid substitutions are made to a small unique fragment of a pheromone receptor
polypeptide, such as a ligand binding site peptide, the substitutions can be made by directly
synthesizing the peptide. The activity of functionally equivalent fragments of pheromone
receptor polypeptides can be tested by cloning the gene encoding the altered pheromone receptor
polypeptide into a bacterial or mammalian expression vector, introducing the vector into an
appropriate host cell, expressing the altered pheromone receptor polypeptide, and testing for a
functional capability of the pheromone receptor polypeptides as disclosed herein. Peptides which
are chemically synthesized can be tested directly for function, e.g., for binding to a ligand to
which the unaltered pheromone receptor is known to bind.

The invention as described herein has a number of uses, some of which are described
elsewhere herein. First, the invention permits isolation of the pheromone receptor polypeptides
of the Examples. A variety of methodologies well-known to the skilled practitioner can be
utilized to obtain isolated pheromone receptor molecules. The polypeptide may be purified from
cells which naturally produce the polypeptide by chromatographic means or immunological
recognition. Alternatively, an expression vector may be introduced into cells to cause production
of the polypeptide. In another method, mRNA transcripts may be microinjected or otherwise
introduced into cells to cause production of the encoded polypeptide. Translation of mRNA in
cell-free extracts such as the reticulocyte lysate system also may be used to produce polypeptide.
Those skilled in the art also can readily follow known methods for isolating pheromone receptor
polypeptides. These include, but are not limited to, immunochromatography, HPLC,
size-exclusion chromatography, ion-exchange chromatography and immune-affinity
chromatography.

The isolation of the pheromone receptor gene also makes it possible for the artisan to
diagnose a disorder characterized by expression of pheromone receptor. These methods involve
determining expression of the pheromone receptor gene, and/or pheromone receptor
polypeptides derived therefrom. In the former situation, such determinations can be carried out
via any standard nucleic acid determination assay, including the polymerase chain reaction as
exemplified in the examples below, or assaying with labeled hybridization probes.

The invention also makes it possible to isolate the naturally occurring ligands
(pheromones) and other ligands that have a ligand binding domain, namely, by the binding of
such molecules to the pheromone receptor polypeptides (or fragments thereof containing a ligand
binding site). Binding of the receptors to a ligand can be accomplished by introducing into a
biological system in which the proteins bind (e.g., a cell) a molecule that includes a binding
domain (putative ligand) in an amount sufficient to detect the binding.

The invention also provides agents such as binding polypeptides which bind to
pheromone receptor polypeptides and/or to complexes of pheromone receptor polypeptides and
their ligand binding partners. Such binding agents can be used, for example, in screening assays
to detect the presence or absence of pheromone receptor polypeptides and complexes of
pheromone receptor polypeptides and their ligand binding partners and in purification protocols
to isolate pheromone receptor polypeptides and complexes of pheromone receptor polypeptides
and their ligand binding partners. Such agents also can be used to inhibit the native activity of
the pheromone receptor polypeptides or their ligand binding partners, for example, by binding
to such polypeptides, or their binding partners or both.

The invention, therefore, embraces peptide binding agents which, for example, can be
antibodies or fragments of antibodies having the ability to selectively bind to pheromone receptor
polypeptides. Antibodies include polyclonal and monoclonal antibodies, prepared according to
conventional methodology.




Significantly, as is well-known in the art, only a small portion of an antibody molecule,
the paratope, is involved in the binding of the antibody to its epitope (see, in general, Clark, W.R.
(1986) The Experimental Foundations of Modern Immunology, Wiley & Sons, Inc., New York;
Roitt, I. (1991) Essential Immunology, 7th Ed., Blackwell Scientific Publications, Oxford). The
pFc' and Fc regions, for example, are effectors of the complement cascade but are not involved
in antigen binding. An antibody from which the pFc' region has been enzymatically cleaved, or
which has been produced without the pFc' region, designated an F(ab')2 fragment, retains both
of the antigen binding sites of an intact antibody. Similarly, an antibody from which the Fc
region has been enzymatically cleaved, or which has been produced without the Fc region,
designated an Fab fragment, retains one of the antigen binding sites of an intact antibody
molecule. Proceeding further, Fab fragments consist of a covalently bound antibody light chain
and a portion of the antibody heavy chain denoted Fd. The Fd fragments are the major
determinant of antibody specificity (a single Fd fragment may be associated with up to ten
different light chains without altering antibody specificity) and Fd fragments retain
epitope-binding ability in isolation.

Within the antigen-binding portion of an antibody, as is well-known in the art, there are
complementarity determining regions (CDRs), which directly interact with the epitope of the
antigen, and framework regions (FRs), which maintain the tertiary structure of the paratope (see,
in general, Clark, 1986; Roitt, 1991). In both the heavy chain Fd fragment and the light chain
of IgG immunoglobulins, there are four framework regions (FR1 through FR4) separated
respectively by three complementarity determining regions (CDR1 through CDR3). The CDRs,
and in particular the CDR3 regions, and more particularly the heavy chain CDR3, are largely
responsible for antibody specificity.

It is now well-established in the art that the non-CDR regions of a mammalian antibody
may be replaced with similar regions of nonspecific or heterospecific antibodies while retaining
the epitopic specificity of the original antibody. This is most clearly manifested in the
development and use of "humanized" antibodies in which non-human CDRs are covalently
joined to human FR and/or Fc/pFc' regions to produce a functional antibody. Thus, for example,
PCT International Publication Number WO 92/04381 teaches the production and use of
humanized murine RSV antibodies in which at least a portion of the murine FR regions have
been replaced by FR regions of human origin. Such antibodies, including fragments of intact
antibodies with antigen-binding ability, are often referred to as "chimeric" antibodies.

Thus, as will be apparent to one of ordinary skill in the art, the present invention also
provides for F(ab')2, Fab, Fv and Fd fragments; chimeric antibodies in which the Fc and/or FR
and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous
human or non-human sequences; chimeric F(ab')2 fragment antibodies in which the FR and/or
CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human
or non-human sequences; chimeric Fab fragment antibodies in which the FR and/or CDR1 and/or
CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human
sequences; and chimeric Fd fragment antibodies in which the FR and/or CDR1 and/or CDR2
regions have been replaced by homologous human or non-human sequences. The present
invention also includes so-called single chain antibodies.

Thus, the invention involves polypeptides of numerous sizes and types that bind
specifically to pheromone receptor polypeptides, and/or complexes of both pheromone receptor
polypeptides and their ligand binding partners. These polypeptides may be derived also from
sources other than antibody technology. For example, such polypeptide binding agents can be
provided by degenerate peptide libraries which can be readily prepared in solution, in
immobilized form or as phage display libraries. Combinatorial libraries also can be synthesized
of peptides containing one or more amino acids. Libraries further can be synthesized of peptoids
and non-peptide synthetic moieties.

Phage display can be particularly effective in identifying binding peptides useful
according to the invention. Briefly, one prepares a phage library (using e.g. M13, fd, or lambda
phage), displaying inserts from 4 to about 80 amino acid residues using conventional procedures.
The inserts may represent, for example, a completely degenerate or biased array. One then can
select phage-bearing inserts which bind to the pheromone receptor polypeptide. This process
can be repeated through several cycles of reselection of phage that bind to the pheromone
receptor polypeptide. Repeated rounds lead to enrichment of phage bearing particular
sequences. DNA sequence analysis can be conducted to identify the sequences of the expressed
polypeptides. The minimal linear portion of the sequence that binds to the pheromone receptor
polypeptide can be determined. One can repeat the procedure using a biased library containing
inserts containing part or all of the minimal linear portion plus one or more additional degenerate
residues upstream or downstream thereof. Yeast two-hybrid screening methods also may be used
to identify polypeptides that bind to the pheromone receptor polypeptides. Thus, the pheromone
receptor polypeptides of the invention, or a fragment thereof, can be used to screen peptide
libraries, including phage display libraries, to identify and select peptide binding partners of the
pheromone receptor polypeptides of the invention. Such molecules can be used, as described,
for screening assays, for purification protocols, for interfering directly with the functioning of
pheromone receptor and for other purposes that will be apparent to those of ordinary skill in the
art.

A pheromone receptor polypeptide, or a fragment which contains the ligand binding site,
also can be used to isolate naturally-occurring ligands and other binding partners of the receptors
of the invention. For example, an isolated pheromone receptor can be used to isolate ligands
that bind to the receptor binding site by immobilizing a receptor (or fragment containing the
ligand binding site) on a chromatographic medium, such as polystyrene beads, or a filter, and
using the immobilized polypeptide to isolate molecules that bind to this affinity matrix in
accordance with standard procedures for affinity chromatography.

It will also be recognized that the invention embraces the use of the pheromone receptor 
cDNA sequences in expression vectors, as well as to transfect host cells and cell lines, be these
prokaryotic (e.g., E. coli), or eukaryotic (e.g., CHO cells, COS cells, yeast expression systems
and recombinant baculovirus expression in insect cells). Especially useful are oocytes, 
mammalian cells such as mouse, hamster, pig, goat, primate, etc. They may be of a wide variety 
of tissue types, and include primary cells and cell lines. The expression vectors require that the 
pertinent sequence, i.e., those nucleic acids described supra, be operably linked to a promoter. 


When administered, the therapeutic compositions of the present invention are 
administered in pharmaceutically acceptable preparations. Such preparations may routinely 
contain pharmaceutically acceptable concentrations of salt, buffering agents, preservatives, 
compatible carriers, supplementary immune potentiating agents such as adjuvants and cytokines
and optionally other therapeutic agents.

The therapeutics of the invention can be administered by any conventional route, 
including injection or by gradual infusion over time. The administration may, for example, be 
oral, intravenous, intraperitoneal, intramuscular, intracavity, subcutaneous, or transdermal. 
When antibodies are used therapeutically, a preferred route of administration is by pulmonary
aerosol. Techniques for preparing aerosol delivery systems containing antibodies are well known
to those of skill in the art. Generally, such systems should utilize components which will not 
significantly impair the biological properties of the antibodies, such as the paratope binding
capacity (see, for example, Sciarra and Cutie, "Aerosols," in Remington's Pharmaceutical
Sciences, 18th edition, 1990, pp 1694-1712; incorporated by reference). Those of skill in the art
can readily determine the various parameters and conditions for producing antibody aerosols 
without resort to undue experimentation. When using antisense preparations of the invention, 
slow intravenous administration is preferred.

Preparations for parenteral administration include sterile aqueous or non-aqueous 
solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, 
polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl 
oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions,
including saline and buffered media. Parenteral vehicles include sodium chloride solution,
Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's or fixed oils. Intravenous 
vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on 
Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, 
for example, antimicrobials, anti-oxidants, chelating agents, and inert gases and the like. 

The preparations of the invention are administered in effective amounts. An effective
amount is that amount of a pharmaceutical preparation that alone, or together with further doses,
produces the desired response in the condition being treated, e.g., modifying fertility or 
pheromone-mediated behaviors that are related to reproduction or aggression. For example, this 
can involve the use of the compounds of the invention as pesticides to slow or halt insect or
rodent behaviors that result in reproduction. Alternatively, this can involve the use of the
compounds of the invention as agents for controlling fertility in animals (e.g., livestock, domestic 
animals), by providing compounds which inhibit or stimulate the behaviors in such animals that 
result in reproduction or aggression. This can be monitored by routine methods, e.g., observing
the behavior in the animal (vertebrate or invertebrate) recipient. 

The invention also contemplates gene therapy, e.g., to prepare an animal model for
studying the conditions and behaviors (e.g., fertility, aggression) that are pheromone
receptor-mediated. The procedure for performing ex vivo gene therapy is outlined in U.S. Patent
5,399,346 and in exhibits submitted in the file history of that patent, all of which are publicly 
available documents. In general, it involves introduction in vitro of a functional copy of a gene
into a cell(s) of a subject which contains a defective copy of the gene, and returning the
genetically engineered cell(s) to the subject. The functional copy of the gene is under operable 
control of regulatory elements which permit expression of the gene in the genetically engineered
cell(s). Numerous transfection and transduction techniques as well as appropriate expression
vectors are well known to those of ordinary skill in the art, some of which are described in PCT 
application WO95/00654. In vivo gene therapy using vectors such as adenovirus, retroviruses, 
herpes virus, and targeted liposomes also is contemplated according to the invention.

The invention further provides efficient methods of identifying pharmacological agents
or lead compounds for agents active at the level of a pheromone receptor or pheromone receptor
fragment modulatable cellular function. In particular, such functions include ligand binding 
activity. Generally, the screening methods involve assaying for activation of pheromone 
receptors or assaying for compounds which interfere with a pheromone receptor activity such
as pheromone receptor binding to its cognate ligand. Such methods are adaptable to automated,
high throughput screening of compounds. The target therapeutic indications for pharmacological 
agents detected by the screening methods that block pheromone receptor activity are limited only 
in that the target cellular function be subject to modulation by alteration of the formation of a 
complex comprising a pheromone receptor polypeptide or fragment thereof and one or more
natural pheromone receptor ligands. Target indications include cellular processes modulated by
pheromone receptor signal transduction following receptor-ligand binding. 

A wide variety of assays for pharmacological agents are provided, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays,
cell-based assays such as two- or three-hybrid screens, expression assays, activation of G-proteins,
etc. For example, three-hybrid screens are used to rapidly examine the effect of transfected
nucleic acids on the intracellular binding of pheromone receptor or pheromone receptor
fragments to specific extracellular targets (e.g., ligands in biological samples, such as urine,
vaginal fluid, or in combinatorial libraries).

Pheromone receptor fragments used in the methods, when not produced by a transfected
nucleic acid, are added to an assay mixture as an isolated polypeptide. The assay can be used to
screen putative ligands for their ability to bind to the receptor. Pheromone receptor 
polypeptides preferably are produced recombinantly, although such polypeptides may be isolated 
from biological extracts. Recombinantly produced pheromone receptor polypeptides include 
chimeric proteins comprising a fusion of a pheromone receptor protein with another polypeptide.
For example, a polypeptide fused to a pheromone receptor polypeptide or fragment may also
provide means of readily detecting the fusion protein, e.g., by immunological recognition or by 
fluorescent labeling.

In addition to the pheromone receptor, a screening assay mixture includes a binding
partner for the receptor, e.g., a naturally occurring ligand that is capable of binding to the 
pheromone receptor or, alternatively, is comprised of an analog which mimics the pheromone 
receptor binding properties of the naturally occurring ligand for purposes of the assay. The 
screening assay mixture also comprises a candidate pharmacological agent (e.g., a putative
receptor agonist or antagonist). Typically, a plurality of assay mixtures are run in parallel with 
different agent concentrations to obtain a different response to the various concentrations. 
Typically, one of these concentrations serves as a negative control, i.e., at zero concentration of 
agent or at a concentration of agent below the limits of assay detection. Candidate agents
encompass numerous chemical classes, although typically they are organic compounds.
Preferably, the candidate pharmacological agents are small organic compounds, i.e., those having 
a molecular weight of more than 50 yet less than about 2500, preferably less than about 1000 
and, more preferably, less than about 500. Candidate agents comprise functional chemical 
groups necessary for structural interactions with polypeptides and/or nucleic acids, and typically
include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the
functional chemical groups and more preferably at least three of the functional chemical groups. 
The candidate agents can comprise cyclic carbon or heterocyclic structure and/or aromatic or 
polyaromatic structures substituted with one or more of the above-identified functional groups. 
Candidate agents also can be biomolecules such as peptides, saccharides, fatty acids, sterols,
isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations
thereof and the like. Where the agent is a nucleic acid, the agent typically is a DNA or RNA 
molecule, although modified nucleic acids as defined herein are also contemplated. 
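
The molecular weight preferences stated above for small organic candidate agents can be expressed as a simple filter. The following Python sketch is illustrative only: the function name, the tier labels, and the example weights are assumptions introduced here, and each "about" threshold in the text is treated as a hard cutoff for simplicity.

```python
def mw_preference_tier(mol_weight):
    """Classify a candidate agent by the molecular weight ranges in the text.

    Assumption: the thresholds ("about 2500", "about 1000", "about 500")
    are applied as exact cutoffs; the tier labels are hypothetical.
    """
    if mol_weight <= 50 or mol_weight >= 2500:
        return "outside range"
    if mol_weight < 500:
        return "more preferred"
    if mol_weight < 1000:
        return "preferred"
    return "acceptable"


# Hypothetical candidate weights, not taken from the source.
candidates = {"agent_a": 320.4, "agent_b": 780.0, "agent_c": 1900.0, "agent_d": 3100.0}
tiers = {name: mw_preference_tier(mw) for name, mw in candidates.items()}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

In practice such a filter would be only a first pass, since the text also allows biomolecules and nucleic acids as candidate agents.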

Candidate agents are obtained from a wide variety of sources including libraries of 
synthetic or natural compounds. For example, numerous means are available for random and
directed synthesis of a wide variety of organic compounds and biomolecules, including
expression of randomized oligonucleotides, synthetic organic combinatorial libraries, phage 
display libraries of random peptides, and the like. Alternatively, libraries of natural compounds 
in the form of bacterial, fungal, plant and animal extracts are available or readily produced. 
Additionally, natural and synthetically produced libraries and compounds can readily be
modified through conventional chemical, physical, and biochemical means. Further, known
pharmacological agents may be subjected to directed or random chemical modifications such as
acylation, alkylation, esterification, amidification, etc. to produce structural analogs of the agents.

A variety of other reagents also can be included in the mixture. These include reagents 
such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc. which may be used to 
facilitate optimal protein-protein and/or protein-nucleic acid binding. Such a reagent may also
reduce non-specific or background interactions of the reaction components. Other reagents that
improve the efficiency of the assay such as protease inhibitors, nuclease inhibitors, antimicrobial
agents, and the like may also be used. 

The mixture of the foregoing assay materials is incubated under conditions whereby, but
for the presence of the candidate pharmacological agent, the pheromone receptor polypeptide
specifically binds the cellular binding target, a portion thereof or analog thereof. The order of 
addition of components, incubation temperature, time of incubation, and other parameters of the 
assay may be readily determined. Such experimentation merely involves optimization of the 
assay parameters, not the fundamental composition of the assay. Incubation temperatures
typically are between 4°C and 40°C. Incubation times preferably are minimized to facilitate
rapid, high throughput screening, and typically are between 0.1 and 10 hours. 

After incubation, the presence or absence of specific binding between the pheromone 
receptor polypeptide and one or more binding targets is detected by any convenient method 
available to the user. For cell free binding type assays, a separation step is often used to separate
bound from unbound components. The separation step may be accomplished in a variety of
ways. Conveniently, at least one of the components is immobilized on a solid substrate, from 
which the unbound components may be easily separated. The solid substrate can be made of a 
wide variety of materials and in a wide variety of shapes, e.g., microtiter plate, microbead, 
dipstick, resin particle, etc. The substrate preferably is chosen to maximize signal-to-noise ratios,
primarily to minimize background binding, as well as for ease of separation and cost.

Separation may be effected, for example, by removing a bead or dipstick from a reservoir,
emptying or diluting a reservoir such as a microtiter plate well, rinsing a bead, particle, 
chromatographic column or filter with a wash solution or solvent. The separation step preferably 
includes multiple rinses or washes. For example, when the solid substrate is a microtiter plate,
the wells may be washed several times with a washing solution, which typically includes those
components of the incubation mixture that do not participate in specific binding, such as salts,
buffer, detergent, non-specific protein, etc. Where the solid substrate is a magnetic bead, the
beads may be washed one or more times with a washing solution and isolated using a magnet. 

Detection may be effected in any convenient way for cell-based assays such as two- or 
three-hybrid screens. The transcript resulting from a reporter gene transcription assay of
pheromone receptor polypeptide binding to a target molecule typically encodes a directly or
indirectly detectable product, e.g., β-galactosidase activity, luciferase activity, and the like. A
wide variety of cell based assays for G-protein coupled receptors could also be employed for
detection of molecules that stimulate (agonists) pheromone receptors or block (antagonists) that
stimulation by natural ligands or agonists. Pheromone receptor polypeptides or chimeric
receptors composed only in part of a pheromone receptor could be employed in these assays.
The chimeric receptors might, for example, contain part of another G-protein coupled receptor 
such that binding of a ligand to the pheromone receptor binding domain results in coupling to 
a particular G-protein where activation could be easily assayed. For cell free binding assays, one 
of the components usually comprises, or is coupled to, a detectable label. A wide variety of
labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence,
optical or electron density, etc.), or indirect detection (e.g., epitope tag such as the FLAG epitope,
enzyme tag such as horseradish peroxidase, etc.). The label may be bound to a pheromone 
receptor binding partner (ligand), or incorporated into the structure of the binding partner. 

A variety of methods may be used to detect the label, depending on the nature of the label
and other assay components. For example, the label may be detected while bound to the solid
substrate or subsequent to separation from the solid substrate. Labels may be directly detected 
through optical or electron density, radioactive emissions, nonradioactive energy transfers, etc. 
or indirectly detected with antibody conjugates, streptavidin-biotin conjugates, etc. Methods for
detecting the labels are well known in the art. 

The invention provides pheromone receptor-specific binding agents, methods of
identifying and making such agents, and their use in diagnosis, therapy and pharmaceutical
development, including the development of pesticides and other agents for controlling fertility 
and reproduction (or related behaviors) in animals. For example, pheromone receptor-specific 
pharmacological agents are useful in a variety of diagnostic and therapeutic applications,
especially where disease or disease prognosis is associated with improper utilization of a
pathway involving pheromone receptor. Novel pheromone receptor-specific binding agents 
include pheromone receptor-specific antibodies and other natural intracellular binding agents
identified with assays such as two hybrid screens, and non-natural intracellular binding agents
identified in screens of chemical libraries and the like. 

In general, the specificity of pheromone receptor binding to a binding agent is shown by
binding equilibrium constants. Targets which are capable of selectively binding a pheromone
receptor polypeptide preferably have binding equilibrium constants of at least about 10⁷ M⁻¹,
more preferably at least about 10⁸ M⁻¹, and most preferably at least about 10⁹ M⁻¹. A wide
variety of cell based and cell free assays may be used to demonstrate pheromone
receptor-specific binding. Cell based assays include one, two and three hybrid screens, assays in which
pheromone receptor-mediated transcription is inhibited or increased, activation of G-proteins,
etc. Cell free assays include pheromone receptor-protein binding assays, immunoassays, etc.
Other assays useful for screening agents which bind pheromone receptor polypeptides include
fluorescence resonance energy transfer (FRET), and electrophoretic mobility shift analysis
(EMSA).
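
Binding equilibrium (association) constants are often easier to compare once converted to dissociation constants, since Kd = 1/Ka. The short Python sketch below performs that conversion for the three thresholds named above. The function and variable names are illustrative assumptions and are not part of the source.

```python
def kd_nanomolar(ka_per_molar):
    """Convert an association constant Ka (M^-1) to a dissociation constant Kd (nM)."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1e9 / ka_per_molar  # Kd (M) = 1 / Ka, and 1 M = 1e9 nM


# Thresholds from the text: at least about 1e7, 1e8 and 1e9 M^-1.
thresholds = {"preferred": 1e7, "more preferred": 1e8, "most preferred": 1e9}
kd_limits = {label: kd_nanomolar(ka) for label, ka in thresholds.items()}
for label, kd in kd_limits.items():
    print(f"{label}: Ka >= {thresholds[label]:.0e} M^-1 corresponds to Kd <= {kd:g} nM")
```

Read this way, the three tiers correspond to dissociation constants of roughly 100 nM, 10 nM and 1 nM or tighter.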

Various techniques may be employed for introducing nucleic acids of the invention into 
cells, depending on whether the nucleic acids are introduced in vitro or in vivo in a host. Such 
techniques include transfection of nucleic acid-CaPO₄ precipitates, transfection of nucleic acids
associated with DEAE, transfection with a retrovirus including the nucleic acid of interest, 
liposome mediated transfection, and the like. For certain uses, it is preferred to target the nucleic 
acid to particular cells. In such instances, a vehicle used for delivering a nucleic acid of the 
invention into a cell (e.g., a retrovirus, or other virus; a liposome) can have a targeting molecule 
attached thereto. For example, a molecule such as an antibody specific for a surface membrane 
protein on the target cell or a ligand for a receptor on the target cell can be bound to or 
incorporated within the nucleic acid delivery vehicle. For example, where liposomes are 
employed to deliver the nucleic acids of the invention, proteins which bind to a surface 
membrane protein associated with endocytosis may be incorporated into the liposome 
formulation for targeting and/or to facilitate uptake. Such proteins include capsid proteins or 
fragments thereof tropic for a particular cell type, antibodies for proteins which undergo 
internalization in cycling, proteins that target intracellular localization and enhance intracellular 
half life, and the like. Polymeric delivery systems also have been used successfully to deliver 
nucleic acids into cells, as is known by those skilled in the art. Such systems even permit oral 
delivery of nucleic acids. 






Examples 

Example 1 

Experimental Procedures

Preparation and analysis of single cell cDNAs

Male mouse (C57BL/6J) VNOs were minced, incubated in Trypsin-EDTA (Gibco-BRL/LTI,
Rockville, Maryland), and triturated to obtain dissociated cells. The cells were
centrifuged (1000 RPM, 5 min) and resuspended in phosphate buffered saline + 0.1% bovine
serum albumin. Individual cells that appeared to be neurons were transferred to separate tubes
with a microcapillary pipet.

cDNAs were prepared from each cell and amplified according to Brady and Iscove 
(Methods in Enzymology, 1993, 225:611-621) with minor modifications. Briefly, cDNAs were 
prepared from the 3' ends of mRNAs by reverse transcription with an oligo (dT) primer, and a 
poly dA stretch was added to each cDNA with terminal transferase. The cDNAs were then
amplified by PCR with one of two primers, AL1 (ATTGGATCCAGGCCGCTCTGGACAAAATATGAATTC(T))
(SEQ ID NO. 56) (Dulac and Axel, Cell, 1995, 83:195-206) or AL3
(GGCACATGGACGAAATCTTGGTACTCTTCAGAATTC(T)) (SEQ ID NO. 57), and Taq
polymerase [Amplitaq LD ("ALD") or Amplitaq Stoffel Fragment ("ASF") (Perkin Elmer,
Norwalk, CT)].

Aliquots of each cDNA sample were electrophoresed on agarose gels and blotted onto
nylon membranes (Hybond N+, Amersham, Piscataway, NJ) (Ausubel, F., et al., Current
Protocols in Molecular Biology, 1988, John Wiley & Sons NY, NY; Sambrook, J., et al.,
Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory
Press, 1989). The blots were hybridized at 55° or 70°C in Hyb Buffer (0.5M sodium phosphate
buffer (pH 7.3), 4% SDS, 1% bovine serum albumin (BSA)) with ³²P-labeled probes prepared by
random priming (Prime-It II, Stratagene, La Jolla, CA).

Construction and screening of single cell cDNA libraries 

An aliquot of cDNA sample VN14 was digested with Eco RI and gel-isolated fragments
of 0.1-1.5 kb were cloned into λZapII (Ausubel, F., et al., Current Protocols in Molecular
Biology, 1988, John Wiley & Sons NY, NY; Sambrook, J., et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989). Two
thousand library clones were plated at low density. Replica filter lifts were hybridized at 75°C
(in Hyb Buffer containing 2 µg/ml poly (dT)24 and 1 µg/ml of random dA-dT 20-mers) to
³²P-labeled probes (~2.5 x 10⁸ CPM/µg; 5 x 10⁶ CPM/ml) prepared by PCR of different single cell
cDNA samples. Clones that hybridized to only a VN14 probe were isolated, and a probe
prepared from the insert of each was hybridized to blots of selected single cell cDNAs. Clones
that hybridized to only VN14 cDNAs were sequenced.
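
The two probe figures quoted above (specific activity in CPM per µg and working concentration in CPM per ml) together imply the mass of probe per ml of hybridization buffer. The Python sketch below works through that arithmetic. The function name is an illustrative assumption, and the approximate specific activity is treated as exact.

```python
def probe_mass_ng_per_ml(cpm_per_ml, cpm_per_ug):
    """Return probe mass concentration (ng/ml) from the two activity figures."""
    if cpm_per_ug <= 0:
        raise ValueError("specific activity must be positive")
    ug_per_ml = cpm_per_ml / cpm_per_ug
    return ug_per_ml * 1000.0  # 1 µg = 1000 ng


# Values quoted in the text: about 2.5e8 CPM/µg and 5e6 CPM/ml.
probe_ng_per_ml = probe_mass_ng_per_ml(5e6, 2.5e8)
print(f"{probe_ng_per_ml:.0f} ng of probe per ml")
```

The result is small relative to the 2 µg/ml and 1 µg/ml of unlabeled competitor oligonucleotides in the same buffer.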

Isolation and analysis of VR cDNA clones 

scl53, one VN14+VN2- clone from the VN14 library, was used as probe to screen a
mouse VNO cDNA library ('λVNO') (Berghard, A., et al., J Neurosci, 1996, 16:909-918) and
a mouse genomic DNA library (Stratagene, La Jolla, CA) (70°C, Hyb buffer). Hybridizing
clones were found only in the genomic library. A fragment containing 2 kb upstream of scl53
was isolated from one genomic clone (153G1) and used to screen λVNO (55°C, Hyb Buffer). The
region (D10-TM7) of one clone (D10) that showed homology to TM7 of the CSR (SEQ ID NO.
59) was then used to screen λVNO (55°C, Hyb Buffer), yielding a variety of VR cDNA clones.
Additional clones were obtained from λVNO using probes prepared from clones previously
isolated, or from PCR products obtained by amplification of mouse genomic DNA or VNO
cDNA with degenerate primers (Buck, L., et al., Cell, 1991, 65:175-187) matching conserved
motifs in the VRs. Some PCR products were also cloned into pCR2.1 (Invitrogen, Carlsbad,
CA) and sequenced.

Analysis of VR mRNAs by RT-PCR 

Random-primed cDNA prepared from male or female C57BL/6J mouse VNO RNAs (or 
VR cDNA clones) were used in PCR reactions with degenerate primers (Buck and Axel, Cell,
1991, 65:175-187) matching conserved VR motifs to amplify VR sequences corresponding to
amino acids 33-772 in VR1 (SEQ ID NO. 2). Nested PCR was performed with a 1/1000 dilution 
of the first PCR reaction and primer pairs matching regions of putative exons 1 and 6 in specific 
VR cDNA clones. Blots prepared from size-fractionated, nested PCR products were hybridized 
(70°C, Hyb buffer containing 100 µg/ml herring sperm DNA (Sigma, St Louis, MO)) to probes
prepared from the PCR products of the cDNA clones.



Northern and Southern blots and genomic library screens

Northern Blots: One µg of PolyA+ RNA prepared from mouse VNO and OE, or
purchased from Clontech (other tissue RNAs), was size fractionated on formaldehyde gels, and
blotted (see above) (Berghard and Buck, J Neurosci, 1996, 16:909-918). The blot was
hybridized (70°C, Hyb Buffer) with a ³²P-labeled probe prepared from the regions of cDNAs
VR1, VR2, VR4, and VR15 corresponding to that encoding amino acids 33-772 in VR1 (SEQ
ID NO. 1).

Southern Blots: 5 µg of genomic DNA prepared from C57BL/6J mouse liver was
digested with Eco RI or Hind III, size fractionated, and blotted (Ressler et al., Cell, 1993,
73:597-609). The blots were hybridized (70°C, Hyb buffer containing sperm DNA (see above)) to
probes prepared from 3' untranslated segments of different VR cDNA clones [VR2 (nt. 2607-2961
of SEQ ID NO. 3), VR3 (nt. 2505-2907 of SEQ ID NO. 5), and VR15 (nt. 3239-3689 of
SEQ ID NO. 29)]. A VR4 probe was also used, which gave the same results as the highly related
VR15 probe.

Genomic library screens to determine VR gene number: A mouse genomic library was
screened separately at 70°C or 55°C (see above) with different ³²P-labeled probes. Probe 1: a
mix of segments of cDNAs VR1 (SEQ ID NO. 1), VR2 (SEQ ID NO. 3), VR4 (SEQ ID NO. 7),
and VR15 (SEQ ID NO. 29) encoding the region corresponding to amino acids 619-772 of VR1
(SEQ ID NO. 2). Probes 2-6: Segments of VR genes obtained from mouse genomic DNA by
PCR with degenerate primers matching conserved VR sequence motifs. The PCR segments
corresponded to the following amino acid stretches in VR1 (SEQ ID NO. 2): amino acids 191-397,
565-825, 637-825, 637-804, and 619-784. For example, degenerate oligonucleotide primer pairs
used included:

for amino acids 191-397:
5' primer= (GCT)H(CT)A(CT)CA(AG)(AG)TIGCI(AC)CIAA(AG)GA(CT)AC (SEQ ID NO. 60),
3' primer= G(CT)(AG)T(GT)IGCI(AG)(CT)I(AG)C(AG)T(AG)IACI(AG)C(AG)TT (SEQ ID NO. 61);

for amino acids 565-825:
5' primer= (AC)(AG)ITG(CT)CCI(GT)AnA(CT)(AC)A(AG)TA(CT)GCIAA (SEQ ID NO. 62),
3' primer= GIC(GT)IA(CT)IA(AG)IATIA(CT)(AG)TAI(AC)(AT)(CT)TTIGGIAC (SEQ ID NO. 63);

for amino acids 637-825:
5' primer= ATI(AT)(GC)I(CT)TI(AG)TITT(CT)TG(CT)TT(CT)(CT)TITG (SEQ ID NO. 64),
3' primer= GIC(GT)IA(CT)IA(AG)IATIA(CT)(AG)TAI(AC)(AT)(CT)TTIGGIAC (SEQ ID NO. 63);

for amino acids 637-804:
5' primer= ATI(AT)(GC)I(CT)TI(AG)TITT(CT)TG(CT)TT(CT)(CT)TITG (SEQ ID NO. 64),
3' primer= (AG)IATI(GC)(AT)(AG)AAIA(CT)(COTCIACI(AG)CIACCAT (SEQ ID NO. 65);
and

for amino acids 619-784:
5' primer= GA(CT)ACICCIATIGTIAA(AG)GCIAA(CT)AA (SEQ ID NO. 66),
3' primer= AAIGTIA(CT)CCAIACI(GC)(AT)(AG)CA(AG)AAIAC (SEQ ID NO. 67), wherein
all primers are in a 5'-3' direction, I: inosine.
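
The primer notation used above lists alternative bases in parentheses and uses I for inosine. As a minimal sketch of how such a string can be read, the Python function below counts primer length, the number of distinct sequences in the degenerate pool, and the number of inosine positions. It assumes inosine is a single synthesized base, so it does not multiply the pool size. The function name is hypothetical.

```python
def describe_degenerate_primer(primer):
    """Return (length, degeneracy, inosine_count) for a primer string.

    Parenthesized groups such as "(CT)" mark one position with alternative
    bases; "I" marks inosine. Assumption: inosine counts as one base and
    does not add to degeneracy.
    """
    length = 0
    degeneracy = 1
    inosines = 0
    i = 0
    while i < len(primer):
        ch = primer[i]
        if ch == "(":
            close = primer.find(")", i)
            if close == -1:
                raise ValueError("unbalanced parenthesis in primer")
            options = primer[i + 1:close]
            if not options or any(b not in "ACGT" for b in options):
                raise ValueError(f"unexpected group: ({options})")
            degeneracy *= len(set(options))
            i = close + 1
        elif ch in "ACGT":
            i += 1
        elif ch == "I":
            inosines += 1
            i += 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
        length += 1
    return length, degeneracy, inosines


# SEQ ID NO. 66 and 67 as printed in the text (the 619-784 primer pair).
fwd = describe_degenerate_primer("GA(CT)ACICCIATIGTIAA(AG)GCIAA(CT)AA")
rev = describe_degenerate_primer("AAIGTIA(CT)CCAIACI(GC)(AT)(AG)CA(AG)AAIAC")
print("5' primer (length, degeneracy, inosines):", fwd)
print("3' primer (length, degeneracy, inosines):", rev)
```

Strings containing characters outside this notation raise a ValueError rather than being guessed at, which is useful for flagging transcription damage in a printed sequence.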

In situ hybridization 

In situ hybridization was performed according to Schaeren-Wiemers and Gerfin-Moser
(Histochemistry, 1993, 100:431-440) with sequential 16 micron sections of male or female
VNOs. Digoxigenin-labeled cRNA probes were prepared from the same 3' untranslated regions
of VR cDNAs as used for the genomic Southern blots. Sections were counter-stained with
Hoechst 33258, which labels nuclei. The number of Gαo- or Gαi2-labeled cells (or cells labeled
with VR probes) was determined by counting the number of nuclei in labeled regions. The total
number of cells was considered to be the sum of Gαo+ and Gαi2+ cells in adjacent sections.

Chromosome mapping of VR genes 

Southern blots of genomic DNA from C57BL/6J and Mus spretus (Jackson Labs) 
digested with different restriction enzymes were prepared and probed with specific VR cDNA 
probes as described above. Southern blots of Eco RI, size fractionated genomic DNAs from 94 
different backcross mice (M. spretus x (M. spretus x C57BL/6J)), were purchased from Jackson 
Labs. These blots were hybridized to probes prepared from 3' untranslated segments of the VR2
or VR4 (see above) cDNA at 70°C and washed (see above). Polymorphic bands were typed as 
either M. spretus or M. spretus/C57BL/6J. The data was sent to the Jackson Laboratory 
Backcross DNA Mapping Panel Resource for determination of the chromosomal locations of the
polymorphic fragments. Additional information was obtained via internet from Jackson
Laboratory Mouse Genome Informatics. 

Cloning of a gene differentially expressed in Gαo+ VNs

Different members of the OR and VNR families are expressed in different neurons in the
OE and Gαi2+ zone of the VNO, respectively. It therefore appeared likely that the same would
be true of sensory receptors expressed by VNs. The differential screening of cDNA libraries
with cDNA probes prepared from a few neurons can be used to identify genes expressed in one
neuron, but not another (Buck, L., et al., Annu. Rev. Neurosci., 1996, 19:517-544). Using PCR,
this can be accomplished with single cells (Brady, G., et al., Methods in Enzymology, 1993,
225:611-621; Dulac, C., et al., Cell, 1995, 83:195-206).

To search for genes encoding receptors expressed by Gαo+ VNs, we looked for genes
expressed in one Gαo+ VN, but not another, using the PCR-based differential screening approach.
In initial experiments, we isolated a series of mouse VNs, prepared cDNAs from the 3' ends of
mRNAs present in each, and amplified the single-cell cDNA fragments by PCR. Many of the
amplified, single-cell cDNA samples hybridized to an OMP probe, confirming their derivation
from VNs (Berghard et al., Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369). With one
exception, Gαo and Gαi2 probes hybridized to different OMP+ samples, allowing us to identify
samples that were derived from Gαo+ VNs.

We next prepared a library from one of the Gαo+ single-cell cDNA samples (VN14), and
isolated clones that hybridized to a probe prepared from VN14, but not to a probe prepared from
another Gαo+ sample (VN2). We identified 3 VN14+VN2- clones, which differed in size, but
were otherwise identical in sequence. None contained an open reading frame, which was not
surprising since, in the method used, the amplified cDNAs are only ~400-800 bp long, and are
derived from the 3' ends of mRNAs (Brady and Iscove, Methods in Enzymology, 1993,
225:611-621).

We next hybridized one of the VN14+VN2- clones (scl53) to the original panel of 
single-cell cDNAs. scl53 hybridized to VN14, but not to any of the other cDNA samples. 
Consistent with this result, scl53 hybridized to only a small percentage (~0.3%) of VNs in VNO
tissue sections.

Using scl53 as probe, we were able to isolate a scl53+ clone from a mouse genomic
library which contained ~2 kb of DNA 5' to the scl53 sequence. Using this 2 kb fragment as
probe, we isolated a matching clone (D10) from the VNO cDNA library. Sequence analysis
showed that scl53 and D10 were derived from the same gene, but that the D10 cDNA was
truncated at the 3' end and did not contain the final 685 bp of sequence present in scl53. Like
scl53, D10 hybridized to only a small percentage of VNs in VNO tissue sections.

The 5' end of the D10 cDNA contained a short open reading frame, which encoded a
protein fragment with homology to transmembrane domain 7 (TM7) of the calcium sensing
receptor (CSR), a G protein-coupled receptor (GPCR) (Brown et al., Nature, 1993, 366:575-580).
When the TM7-related region of D10 (D10-TM7) was hybridized at reduced stringency (55°C)
to the original panel of single-cell cDNAs, it labeled many of the Gαo+ samples, but none of the
Gαi2+ ones (except the one that was also Gαo+, and was probably derived from two cells). Since D10
labeled only a small percentage of VNs in tissue sections under high stringency conditions, this
suggested that many Gαo+ neurons express a gene related to D10, but not identical to it.

A novel multigene family encoding VNO receptors 

Hybridization of D10-TM7 to the VNO cDNA library at reduced stringency yielded a
number of related cDNA clones (e.g. VR1-VR3, SEQ ID NOs. 1-6). Additional related cDNAs
were obtained by RT-PCR with degenerate primers (e.g. VR6-VR7, SEQ ID NOs. 11-14), or
by screening the VNO cDNA library with a PCR product obtained from genomic DNA (e.g.,
VR4, VR5, SEQ ID NOs. 7-10).

These cDNAs encode a novel family of proteins, which are members of the G protein-coupled
receptor (GPCR) superfamily (Figure 1). Like other GPCRs, these VNO receptors
(VRs) have 7 hydrophobic stretches that may serve as membrane spanning domains. Only 287
of 850 residues are identical in all of the molecules shown in Figure 1, indicating that the family
is diverse. The VRs are related to two other types of GPCR, the calcium sensing receptor (CSR)
and the metabotropic glutamate receptors (mGluRs) (Tanabe, Y., et al., Neuron, 1992, 8:169-179;
Brown, E., et al., Nature, 1993, 366:575-580). The most highly related molecule is the
CSR; for example, VR1 is 31% identical to rat CSR (Riccardi et al., Proc. Natl. Acad. Sci. USA,
1995, 92:131-135), with the highest homology residing in the TM1-TM7 region (44%) (Figure
1). However, the VRs comprise a distinct family of receptors, which share novel sequence
motifs, and are more related to one another than they are to other receptors. For example, two
divergent VRs, VR1 (SEQ ID NOs. 1, 2) and VR4 (SEQ ID NOs. 7, 8), are 70% identical in
TM1-TM7, and 48% identical overall.

The VRs are unusual among GPCRs in having an extremely long N-terminal extracellular
domain (Figures 1 and 2). This feature is shared by the CSR and mGluRs, and by an unrelated
class of GPCRs that includes several receptors for glycoprotein hormones (Segaloff, D., et al.,
Oxf. Rev. Reprod. Biol., 1992, 14:141-168). Importantly, the VRs are very different from both
ORs and VNRs, which are also GPCRs (Buck, L., et al., Cell, 1991, 51:127-133; Dulac, C., et
al., Cell, 1995, 83:195-206). VRs share none of the characteristic sequence motifs of ORs or
VNRs. In addition, the size of the N-terminal extracellular domain of VRs (557-565 amino
acids) far exceeds that of ORs and VNRs (~12-28 amino acids) (Figure 2). The VRs are most
variable in the N-terminal domain (25% identical residues compared to 57% in TM1-TM7). In
the structurally-related mGluRs, the ligand binding site is thought to reside in the large
N-terminal domain (O'Hara et al., Neuron, 1993, 11:41-52; Takahashi et al., J. Biol. Chem., 1993,
268:19341-19345). If this is also true of VRs, the accentuated diversity of the N-terminal
domain may reflect an ability to recognize diverse pheromonal ligands.
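The identity percentages quoted above are counts of matching positions in an alignment. The following is a minimal sketch of such a calculation, not the procedure actually used to prepare Figure 1. The function name, the toy sequences, and the convention of excluding gap columns from the denominator are all assumptions made for illustration; only the figures 287 and 850 are taken from the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns in which both sequences carry the same residue.

    ASSUMPTION: columns containing a gap ('-') in either sequence are skipped;
    the text does not state which denominator convention was used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, NOT taken from the sequence listing.
toy_a = "MKT-LLVAGC"
toy_b = "MRTALLV-GC"
toy_identity = percent_identity(toy_a, toy_b)

# Family-wide conservation quoted in the text: 287 of 850 residues identical.
family_wide_identity = 100.0 * 287 / 850

print(f"toy pair: {toy_identity:.1f}% identical")
print(f"all molecules in Figure 1: {family_wide_identity:.1f}% identical")
```

Under these assumptions the 287 invariant positions correspond to roughly one third of the 850 aligned residues.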

Most of the VR cDNAs that we analyzed appeared to belong to one of three subfamilies
of highly related molecules. For example, VR1 (SEQ ID NOs. 1, 2), VR2 (SEQ ID NOs. 3, 4),
and VR3 (SEQ ID NOs. 5, 6) are very similar, as are VR4 (SEQ ID NOs. 7, 8) and VR5 (SEQ
ID NOs. 9, 10), and VR6 (SEQ ID NOs. 11, 12) and VR7 (SEQ ID NOs. 13, 14) (Figure 1).
Nonetheless, our results indicate that all of these cDNAs were derived from different genes.
First, all cDNAs were sequenced on both strands to rule out sequencing errors. Second, the RNA
used for library construction and PCR came from an inbred mouse strain (C57BL/6J), so the
cDNAs cannot be allelic variants. Third, the error rates of reverse transcriptase (or Taq polymerase)
cannot account for the extent to which the cDNAs differ. For example, VR4 (SEQ ID NOs. 7,
8) and VR5 (SEQ ID NOs. 9, 10) cDNAs are 99% identical in nucleotide sequence, but the
reverse transcriptase used to prepare them has an error rate of only 3.6 x 10^-5/bp (Ji, J., et al.,
Biochemistry, 1992, 31:954-958).
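The third argument can be checked with simple arithmetic. The sketch below compares the number of differences implied by 99% identity with the number of errors that reverse transcriptase alone would be expected to introduce. The cDNA length is an illustrative assumption (it cancels out of the final ratio), and Taq polymerase errors are not modeled.

```python
ERROR_RATE_PER_BP = 3.6e-5   # reverse transcriptase error rate quoted in the text
FRACTION_IDENTICAL = 0.99    # VR4 vs. VR5 nucleotide identity quoted in the text
CDNA_LENGTH_BP = 3000        # ASSUMPTION: illustrative length, not stated in the text

# Differences actually observed between the two cDNAs.
observed_differences = (1 - FRACTION_IDENTICAL) * CDNA_LENGTH_BP

# Differences expected if both cDNAs were independent copies of one mRNA,
# each copy accumulating reverse transcriptase errors on its own.
expected_rt_errors = 2 * ERROR_RATE_PER_BP * CDNA_LENGTH_BP

fold_excess = observed_differences / expected_rt_errors

print(f"observed differences:   {observed_differences:.0f}")
print(f"expected from RT alone: {expected_rt_errors:.2f}")
print(f"fold excess:            {fold_excess:.0f}")
```

On these assumptions the observed divergence exceeds the expected copying errors by more than a hundredfold, which is the sense in which enzyme error "cannot account for" the differences.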

Variant forms of VR mRNA 

Many of the VRs we characterized lacked a segment of the N-terminal domain present 
in other VRs. Invariably, the missing segment corresponded to a region of the human CSR 
encoded by a single exon, or pair of exons (Pollak, M., et al., Cell, 1993, 75:1297-1303). We
also found several different VR cDNAs that contained a stretch of noncoding sequence at a site
corresponding to a CSR exon-intron boundary (e.g., VR15). This suggested that the exon-intron
structure of VR genes resembles that of the CSR gene, and that variant forms of VR mRNAs
might be generated by differential RNA splicing.

Variant VR mRNAs could derive either from different genes, or from the same gene by
alternative RNA splicing. Consistent with the latter possibility, two pairs of cDNAs that we
sequenced, VR8 (SEQ ID NOs. 15, 16) and VR9 (SEQ ID NOs. 17, 18), and VR10 (SEQ ID
NOs. 19, 20) and VR11 (SEQ ID NOs. 21, 22), were identical in nucleotide sequence, but were
missing different segments. However, when we used RT-PCR to amplify VNO mRNA
sequences encoding 5 different VRs, we obtained one major PCR product in each case,
regardless of whether the RNA used was from male or female mice. In 4 cases, the size of the
major product corresponded to a complete VR, even though one of the cDNAs (but not the PCR
product) contained an intron (#5). In one case, in which the cDNA lacked one exon (#2), the
major PCR product was even smaller, and was found to lack two exons. Although PCR products
of a smaller size were also seen in these experiments, they were much less abundant.

These results suggest that different VR forms derive from different genes. Thus many
VR genes may be expressed pseudogenes, which either lack one or more exons, or have
mutations that prevent proper RNA splicing. We cannot exclude the possibility that some variant
VRs are functional, however. For example, some truncated VRs that lack transmembrane
domains could conceivably be secreted pheromone-binding proteins.

Differential expression of VR genes in VNO neurons

To investigate the tissue distribution of VR gene expression, we conducted Northern blot
analyses in which size fractionated polyA+ RNAs from different mouse tissues were hybridized
to a mix of radiolabeled VR cDNAs. The mixed probe hybridized to VNO RNAs of ~1.9-3.7
kb, with intense hybridization to RNAs of 2.8-3.5 kb. It did not hybridize to RNAs from a
variety of other tissues, including olfactory epithelium and brain. This suggested that VR genes
may be expressed exclusively in the VNO.

We found two partial cDNAs that were highly related to VR cDNAs in the NCBI dbEST
database, one from spleen and the other from 2-cell stage mouse embryos. However, when we
hybridized the most highly related VR cDNAs (VR6 and VR7) to spleen sections, only one
questionably-labeled cell was seen out of ~1.4 x 10^6 cells with one VR probe, and none was seen
with the other. The EST clones might be DNA contaminants, or be due to the widespread, but
low level, misexpression of tissue specific genes (Sarkar, G., et al., Science, 1989, 244:331-334);
nonetheless, we cannot exclude the possibility that VR genes are expressed at a low frequency
in some other tissues.

To examine the patterns of expression of different VR genes in the VNO, we conducted
in situ hybridization experiments. Labeled segments of the 3' untranslated regions of three VR
cDNAs were hybridized, separately or in combination, to sequential sections through the VNO.
Probes prepared from Gαo and Gαi2 cDNAs were hybridized to adjacent sections to delineate the
Gαo+ and Gαi2+ zones of the VNO neuroepithelium.

The Gαo and Gαi2 probes gave patterns of hybridization similar to those we had previously
seen (Berghard, A., et al., J. Neurosci., 1996, 16:909-918). The Gαo probe hybridized to a wavy
stripe of VNO neurons in the basal (lower) region of the VNO neuroepithelium, whereas the Gαi2
probe hybridized to an adjacent stripe of neurons in the apical (upper) part of the
neuroepithelium. The waviness of the two zones appears to be caused by the periodic presence
of blood vessels near the base of the epithelium (Berghard, A., et al., J. Neurosci., 1996,
16:909-918). Approximately 57% of VNs were labeled by the Gαo probe and 43% were labeled by the
Gαi2 probe. The single layer of supporting cells located just beneath the epithelial surface was not
labeled by either probe.

Each of the VR probes hybridized to a small percentage (2.4-5.7%) of VNs that appeared
to be restricted to the basal, Gαo+ zone of the VNO neuroepithelium. Labeled neurons were
scattered throughout the anterior-posterior and dorsal-ventral extent of the Gαo+ zone. Small
clusters of labeled cells were sometimes seen, particularly with the VR2 probe. The mixed probe
labeled a larger percentage of VNs (10.6%) that was almost equal to the sum of the percentages
labeled by its individual components (10.8%). Thus different Gαo+ neurons must express
different VRs.

No differences were seen in the patterns of hybridization obtained using VNOs from male
and female mice, and no hybridization was observed in the nasal olfactory epithelium using
either the mix of VR probes or a full-length VR cDNA probe (not shown). Subsequent analyses 
of the size of the VR gene family, and the number of VR genes recognized by the VR in situ 
hybridization probes, allowed us to estimate the number of VR genes expressed by individual 
neurons (see below). 

The size of the VR multigene family

To investigate the size of the VR gene family, we hybridized several different mixed VR
gene probes to a mouse genomic library, using high (70°C) or low (55°C) stringency conditions.
A probe prepared from the membrane spanning regions (putative exon 6) of several different
cDNA clones hybridized to 59 and 98 clones per haploid genome equivalent, at high and low
stringency, respectively. To obtain probes that were potentially more diverse, we amplified
internal segments of putative exon 3 or 6 from genomic DNA by PCR with degenerate primers.
At high stringency, these probes hybridized to 60-140 clones per haploid equivalent. These
results indicate that there are as many as 140 VR genes in the mouse genome.

The VR probes that we used for in situ hybridization each labeled a small percentage of
neurons. To determine how many VR genes each probe recognized, we hybridized probes
prepared from the same VR cDNA segments to Southern blots of C57BL/6J mouse genomic
DNA which had been digested with Eco RI or Hind III. Each probe hybridized to a small
number of restriction fragments. Given the small size of the probes (~350-450 bp), most of these
fragments should represent at least one gene, provided that there are no introns in the region
probed. Consistent with this assumption, the VR2 (SEQ ID NO. 3) probe hybridized to 7
different restriction fragments, as many as five of which could be accounted for by characterized
VR cDNAs that were 91-98% identical to VR2 (SEQ ID NO. 3) in the region probed.

Given the number of genes recognized by each VR probe and the percentage of neurons
that hybridized to each, we estimate that each VR gene may be expressed in only ~1.1-1.9% of
Gαo+ VNs. Since there appear to be 60-140 VR genes in the mouse genome, this suggests that
each Gαo+ VNO neuron may express only one, or at most a few, VR genes.
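The reasoning behind this estimate can be sketched numerically. If each gene is expressed in a given fraction of neurons, the mean number of genes expressed per neuron is the family size multiplied by that fraction. The sketch below assumes, purely for illustration, that all VR genes are expressed at a similar frequency and only in Gαo+ neurons; the function name is hypothetical.

```python
# Values quoted in the text.
PER_GENE_FRACTION = (0.011, 0.019)  # fraction of G-alpha-o+ VNs expressing a given VR gene
FAMILY_SIZE = (60, 140)             # estimated number of VR genes in the mouse genome


def mean_genes_per_neuron(n_genes: int, fraction_per_gene: float) -> float:
    """Mean number of VR genes expressed per neuron.

    ASSUMPTION: every gene in the family is expressed in the same fraction of
    neurons, so the expected count per neuron is n_genes * fraction_per_gene.
    """
    return n_genes * fraction_per_gene


low_estimate = mean_genes_per_neuron(FAMILY_SIZE[0], PER_GENE_FRACTION[0])
high_estimate = mean_genes_per_neuron(FAMILY_SIZE[1], PER_GENE_FRACTION[1])

print(f"mean VR genes per neuron: {low_estimate:.2f} to {high_estimate:.2f}")
```

Across the quoted ranges the product stays between roughly 0.7 and 2.7, consistent with one, or at most a few, VR genes per neuron.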

Linkage of chromosomal clusters of VR and OR genes 

We previously found that there are clusters of OR genes at multiple chromosomal sites 
in the mouse genome (Sullivan, S., et al., Proc. Natl. Acad. Sci., 1996, 93:884-888). To
investigate the chromosomal locations of VR genes, we used the Jackson Laboratory Backcross 
DNA Mapping Panel, which allows the mapping of mouse genes using interspecies mouse 
crosses. 

Probes prepared from the 3' untranslated regions of VR2 (SEQ ID NO. 3) or VR4 cDNAs
were first hybridized to Southern blots of genomic DNAs from two mouse species, C57BL/6J
and Mus spretus, which had been digested with different restriction enzymes. Eco RI digests
showed a number of restriction length polymorphisms with both VR probes. The VR probes
were then hybridized to Eco RI-digested DNAs from a large panel of different backcross mice
((C57BL/6J x M. spretus) x M. spretus).

The patterns of inheritance of the polymorphic fragments recognized by the two VR
probes allowed us to assign chromosomal locations to approximately 9 VR genes. Using the
VR4 (SEQ ID NO. 7) probe, we could follow the inheritance of 4 polymorphic restriction
fragments. All of these cosegregated in the backcrosses, and mapped to the proximal end of
chromosome 7 (near D7Bir5). Five restriction fragments were followed for the VR2 (SEQ ID
NO. 3) probe. Again, all of the restriction fragments cosegregated, allowing us to map the VR2
(SEQ ID NO. 3) fragments to the distal end of chromosome 4 (near D4Bir1). Given the
resolution of the genetic mapping, the cosegregating fragments can be no more than 3.8 cM from
one another. These results indicate that VR genes are located near the ends of at least two
different mouse chromosomes. They also indicate that highly related VR genes are clustered at
the same chromosomal locus, as previously seen in our studies and others (Ben-Arie et al.,
Human Molecular Genetics, 1994, 3:229-235).

The VR4 gene subfamily appears to be closely linked to one OR gene locus (Olfr5)
(Sullivan, S., et al., Proc. Natl. Acad. Sci., 1996, 93:884-888). Although the VRs and ORs were
mapped in different mouse crosses, the synaptotagmin-3 gene (Syt3) was mapped in both
crosses, allowing an estimate of their relative positions. The OR locus mapped 15.05 cM
proximal to Syt3 while the VR4 gene cluster mapped 14.89 cM proximal to Syt3 (Jackson
Laboratory Mouse Genome Informatics), suggesting a close linkage between VR and OR genes
at the proximal end of chromosome 7. Our previous studies indicate that multiple OR gene loci
arose via a series of duplications of very large chromosomal domains that maintained linkages
between OR genes and members of other gene families. These results therefore suggest that VR
genes and OR genes might have been linked in a primitive ancestor. They also suggest the
possibility that additional clusters of VR genes might be linked to other OR gene loci.
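The relative position of the two loci follows from their distances to the shared Syt3 marker. The sketch below simply subtracts the two quoted distances. It assumes, as a simplification, that map distances measured in two different crosses can be compared directly, so the result should be read as a rough indication rather than a measured interval.

```python
# Distances proximal to Syt3 on chromosome 7, as quoted in the text (in cM).
OR_LOCUS_TO_SYT3_CM = 15.05
VR4_CLUSTER_TO_SYT3_CM = 14.89

# ASSUMPTION: distances from the two different crosses are directly comparable.
separation_cm = abs(OR_LOCUS_TO_SYT3_CM - VR4_CLUSTER_TO_SYT3_CM)

print(f"approximate OR locus to VR4 cluster separation: {separation_cm:.2f} cM")
```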

Example 2 

Experimental procedures 

Preparation of cDNA Libraries from Isolated VNO Neurons 

VNOs were dissected from adult (7- to 8-week-old) male Lewis rats (Sprague-Dawley).
Single-cell cDNA synthesis and amplification were performed and checked according to Dulac
and Axel (Cell, 1995, 83:195-206). Southern blot analysis of single-cell cDNA was used to
detect expression of tubulin, OMP, Go, and Gi2α (Dulac and Axel, Cell, 1995, 83:195-206).
Eighteen cDNAs showed strong hybridization with tubulin and OMP probes, indicating that they
originated from mature neurons, and were selected for further study. Cells VN3 and VN13
exhibited high levels of Go expression, whereas VN10 showed presence of Gi2α, indicating the
origin of these cells from two distinct regions of the VNO neuroepithelium. A VN13 single-cell
cDNA library was prepared according to Dulac and Axel (Cell, 1995, 83:195-206).

Differential Screening of Single-Cell Library 

Plaque-forming units (12 x 10^3) from the VN13 library were plated at low density, and
duplicate filters (Hybond N+, Amersham) were hybridized with probes generated from VN10 and
VN13 single-cell cDNAs, following the procedure described in Dulac and Axel, Cell, 1995,
83:195-206. Ten phage plaques were detected that showed a positive signal unique to the VN13
probe. These plaques were purified, and the corresponding phage inserts were amplified by PCR,
run on 1.5% agarose gel, blotted onto nylon filter, and hybridized with the VN10, VN3, and
VN13 single-cell cDNA probes.

Isolation and Analysis of Full-Length cDNA Clones

A 425 bp clone, Go-VN13A, present at a frequency of 0.1% in the VN13 single-cell
cDNA library, was selected and in vivo excised to generate the pBlueScriptSK(-) phagemid.
High stringency (65°C) screening of a cDNA library prepared from female rat VNO (Dulac and
Axel, Cell, 1995, 83:195-206) with the Go-VN13A cDNA probe led to the isolation of
Go-VN13B (SEQ ID NO. 49), which shares 90% sequence homology with Go-VN13A. Phages
(7.2 x 10^5) of the female rat VNO library were further screened with the Go-VN13B (SEQ ID
NO. 49) cDNA probe under low stringency conditions: hybridization was carried out at 55°C for
24 hr, and the filters were washed three times at 55°C for 30 min in 0.5x SSC and 0.5% SDS.
A total of 75 positive phages were identified and the corresponding inserts were amplified by
PCR and analyzed by Southern blot using the Go-VN13B (SEQ ID NO. 49) probe at both high
(65°C) and low (55°C) stringency. This led to the identification of 22 cDNA clones with insert
sizes longer than 3 kb. Among those, six distinct subfamilies were defined by absence of
cross-hybridization under stringent conditions of hybridization and washing. Full-length clones
(Go-VN1 to Go-VN6, SEQ ID NOs. 33, 35, 37, 39, 41, 43), each representative of a subfamily,
were selected for in vivo excision and sequenced. Go-VN13C (SEQ ID NO. 47) and Go-VN13B
(SEQ ID NO. 49) are identical sequences differing by a 150 bp deletion in Go-VN13C (SEQ ID
NO. 47). The deleted sequence encodes

NMDQCANCPEYQYANTEKNKCIQKGVIVLSYEDPLGMALALIAFCFSAF (SEQ ID
NO. 58) in Go-VN13B (SEQ ID NO. 49) and is replaced by an M at position 552 in Go-VN13C
(SEQ ID NO. 48).

DNA Sequencing and Sequence Analysis 

DNA sequencing was performed using ABI Prism dye terminator cycle ready reaction
(Perkin Elmer, Norwalk, CT) according to manufacturer's protocol. Samples were run on an ABI
Prism 310 Genetic Analyzer (Perkin Elmer, Norwalk, CT). Sequence homologies were
determined using the BLAST system (NIH network service). Pairwise and Clustal W alignments
(BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis were obtained with
the MacVector sequence analysis software (Oxford Molecular Group).
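A Kyte-Doolittle analysis assigns each residue a hydropathy value and averages the values over a sliding window; sustained positive averages suggest a hydrophobic, potentially membrane-spanning, stretch. The following is a minimal sketch of that calculation and is not the MacVector implementation. It uses the published Kyte-Doolittle scale and, as input, the peptide listed above as SEQ ID NO. 58, copied as printed. The window length of 9 and the function name are assumptions chosen for this short segment.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence.

    ASSUMPTION: window=9 is an illustrative choice; longer windows are
    commonly used when scanning whole proteins for membrane-spanning helices.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Segment present in Go-VN13B and absent from Go-VN13C (SEQ ID NO. 58, as printed).
SEGMENT = "NMDQCANCPEYQYANTEKNKCIQKGVIVLSYEDPLGMALALIAFCFSAF"

profile = hydropathy_profile(SEGMENT, window=9)
print(f"first window mean: {profile[0]:+.2f}")
print(f"last window mean:  {profile[-1]:+.2f}")
```

On this input the profile rises from a hydrophilic start to a strongly hydrophobic end, the pattern expected for a segment that runs from the extracellular domain into a transmembrane helix.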

In Situ Hybridization Analysis

In situ hybridization was performed as described elsewhere (Schaeren-Wiemers, N., et
al., Histochemistry, 1993, 100:431-440). VNOs were dissected from adult male (8- to
9-week-old), adult female (9- to 11-week-old), and young (1-week-old) rats. Tissues were
embedded in Tissue-Tek OCT. Antisense and sense digoxigenin-labeled probes were generated
from the full-length cDNAs encoding Go, Gi2α, Go-VN13B (SEQ ID NO. 49), and Go-VN1
to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41, 43), as well as from the 3' untranslated regions of
the Go-VN1 to Go-VN6 clones.

Image Processing and Statistical Analysis

Digital photographs were captured with a Leitz DMRB microscope (Leica) coupled to
a ProgRes 3012 digital camera (Kontron Electronic) and further processed with the Photoshop
(Adobe Systems) and Canvas (Deneba) software for Macintosh. The relative positions of cells
exhibiting a positive signal by in situ hybridization were measured along the basal-apical axis
using the NIH Image analysis software. The number of cells in hemiconcentric sections of 10%
along this axis from the basal (value = 0) to the apical (value = 100) boundaries was determined.
Average data for Go-VN1 and Go-VN3 to Go-VN6 were obtained from six to eight VNO
sections, corresponding to four individuals analyzed in two independent experiments. For
Go-VN2, 14 VNO sections, corresponding to ten individuals and four independent experiments,
were analyzed for each sex.

Southern Blot Analysis of Rat Genomic DNA and Screening of Rat and Human Genomic Libraries

Genomic DNA, prepared from Lewis rat (Sprague-Dawley) liver, was digested with the
restriction enzymes EcoRI and BamHI, size fractionated on 0.8% agarose gels, and blotted onto
nylon membrane (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press, 1989). Membranes were cross-linked under UV
light, hybridized overnight at both high (68°C) and low (55°C) stringency in hybridization
buffer, and washed as described above; 32P-labeled probes were generated by random priming,
using the following DNA templates: EcoRI-EcoRV, NotI-NsiI, EcoRI-SalI, PstI-NdeI,
XbaI-HincII, and EcoRI-NsiI fragments of Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39,
41, 43), respectively; a full-length (425 bp) insert of Go-VN13A; and a cDNA fragment
including the seven transmembrane domains of Go-VN13B (SEQ ID NO. 49). Plaque-forming
units (3 x 10^5) from rat and human genomic libraries (Stratagene, La Jolla, CA) were screened
at low stringency (55°C) using a mix of 32P-labeled probes prepared from fragments of Go-VN1
to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41, 43) encompassing the transmembrane domains 2
to 7.

Results 

The VNO Neuroepithelium Expresses Two Independent Families of Pheromone Receptors 

We hypothesized the existence of two distinct families of genes encoding pheromone
receptors that are selectively colocalized with either the Go protein in the basal half of the
vomeronasal neuroepithelium or with the Gi2α protein in the apical region. For simplicity of
nomenclature, and with the understanding that the cosegregation of distinct G-protein subunits
with independent families of pheromone receptors is consistent with, but does not demonstrate, a
functional link, the family of genes encoding putative pheromone receptors that we have
previously identified and that colocalize with Gi2α will be named Gi2α-VN, whereas the novel
family of receptors coexpressed with Go and described in this study will be named Go-VN. In
the absence of information concerning the nature of the Go-VN receptor molecules, we reiterated
the cloning strategy that allowed us to identify a family of putative pheromone receptor genes
expressed by Gi2α+ neurons (Dulac and Axel, Cell, 1995, 83:195-206). This strategy was based
on the assumption that individual neurons within the VNO are likely to express only one
pheromone receptor gene and that transcripts encoding a given receptor represent between 0.1%
and 1% of the mRNA of a single cell. Differential screening of cDNA libraries constructed from
single VNO neurons takes advantage of the fact that different cells express different receptors
and thus provides an experimental solution to the problem of detecting a specific transcript in a
heterogeneous population of neurons. In this attempt, we expected that differential screening of
a cDNA library prepared from an isolated Go+, Gi2α- VNO neuron would permit the isolation
of a class of pheromone receptor genes distinct from the Gi2α-VN family of receptor genes.

A cDNA library prepared from a Go+ neuron (VN13) was differentially hybridized with
32P-labeled probes prepared from VN13 and from a second VNO neuron cDNA (VN10). A 425
bp cDNA (Go-VN13A) present at a frequency of 0.1% in the VN13 cDNA library showed
selective hybridization with the VN13 cell probe. Two cDNAs of longer size, Go-VN13B (SEQ ID
NO. 49) and Go-VN13C (SEQ ID NO. 47), were subsequently isolated from a cDNA library
prepared from dissected adult VNOs and showed 90% sequence similarity with Go-VN13A.
Hybridization to VNO cross-sections with digoxigenin-labeled antisense RNA probe showed that
expression of these transcripts is restricted to a small subpopulation of VNO neurons in a
location consistent with the region of Go expression of the neuroepithelium. The sequence of
Go-VN13B (SEQ ID NO. 49) reveals a partial open reading frame that includes seven
hydrophobic stretches of 20 amino acids in length. The Go-VN13B (SEQ ID NO. 49) sequence does
not share any resemblance with the odorant receptor genes or with the family of putative
pheromone receptor genes previously identified (see below). In addition, hybridization of
Go-VN13B DNA probe to genomic DNA identified two discrete bands at high stringency and
13 or more at lower stringency, revealing the existence of a family of closely related genes in the
rat genome.

Taken together, these data indicate that we have isolated a novel multigene family 
encoding seven transmembrane domain receptors and expressed by subsets of VNO neurons 
from the basal half of the neuroepithelium. 

Sequences of a New Family of VNO Receptors

Recombinant phages from a VNO cDNA library were screened at low stringency with
the Go-VN13B (SEQ ID NO. 49) DNA probe. Six distinct gene subfamilies were isolated that
showed no cross-hybridization under stringent conditions of hybridization and washing. cDNAs
Go-VN1 to Go-VN6, each representative of a subfamily, were fully sequenced (SEQ ID NOs. 33,
35, 37, 39, 41 and 43).

In Go-VN1 to Go-VN5 cDNAs (SEQ ID NOs. 33, 35, 37, 39 and 41), the first methionine
of the open reading frame was tentatively chosen as a start for protein translation, revealing large
open reading frames ranging from 548 to 866 amino acids. A frame shift in the Go-VN6 (SEQ
ID NO. 44) sequence (amino acid 532; indicated by slash bar in Fig. 3) indicated that this
transcript is unable to generate a functional protein.

Deduced Amino Acid Sequences of cDNAs from the Go-VN Family of Pheromone Receptors

The deduced amino acid sequences of eight cDNAs belonging to the Go-VN family of
putative pheromone receptors are shown in Figure 3. The predicted positions of the seven
transmembrane domains are also indicated (I-VII). Amino acids common to at least five cDNAs
are shaded. Amino acids common to the rat mGluR1 and Ca2+-sensing receptors are indicated by a star.

Hydropathy analysis of the predicted Go-VN proteins with the Kyte-Doolittle algorithm
identified a large hydrophilic N-terminal domain that ranges in size from 274 amino acids in
Go-VN1 (SEQ ID NO. 34) to 595 in Go-VN4 (SEQ ID NO. 40). This is preceded in cDNAs
Go-VN4 (SEQ ID NO. 40), Go-VN7 (SEQ ID NO. 46), and Go-VN13C (SEQ ID NO. 50) by
an initial hydrophobic 21 amino acid segment characteristic of eukaryotic signal sequences. A
cluster of seven hydrophobic regions representing potential membrane-spanning helices and
typical of the G protein-coupled receptor superfamily is followed by a short hydrophilic sequence
that indicates a potential intracytoplasmic C-terminal domain. A database search indicated the
presence of sequence motifs common to Ca2+-sensing and metabotropic glutamate (mGluR)
receptors (Houamed, K., et al., Science, 1991, 252:1318-1321; Masu, M., et al., Nature, 1991,
349:760-765; Brown, E., et al., Nature, 1993, 366:575-580; Pollak, M., et al., Cell, 1993,
75:1297-1303). Pairwise sequence alignments reveal 18% to 23% sequence identity between
the rat Ca2+-sensing receptor and the most distant (Go-VN3, SEQ ID NOs. 37, 38) and the closest
(Go-VN1, SEQ ID NOs. 33, 34) Go-VN sequences, respectively. Sequences of rat mGluR1 and
Go-VN cDNAs appear more distantly related. Several localized regions showed a more
pronounced degree of similarity, including a cysteine-rich sequence just preceding the first
transmembrane domain (amino acid 206 to 260 in Go-VN1, SEQ ID NO. 34), the predicted
transmembrane domains 2 to 7 with surrounding cytoplasmic and extracellular loops, and the
relative position of 20 cysteines. The N-terminal and first transmembrane domains show little
degree of homology. In mGluR and Ca2+-sensing receptors, the second intracellular loop is
involved in providing specificity for G-protein coupling (Gomeza, J., et al., J. Biol. Chem.,
1996, 271:2199-2205), enabling different classes of mGluR receptors to activate phospholipase
C or to inhibit adenylyl cyclase. In Go-VN, this domain is rich in basic residues, as expected for
potential G-protein coupling, and shows closer resemblance to the class II and III mGluRs that
were shown to couple to Go and Gi subunits. Overall, the six Go-VN sequences share between
42% and 75% sequence identity. Regions of Go-VN proteins downstream of transmembrane
domain 2 are nearly identical in all VNO receptor sequences. In contrast, N-terminal extracellular
regions and first transmembrane domains are quite divergent.

Anomalies in Go-VN cDNA Sequences: Two unusual features were observed in the
sequence of some Go-VN cDNAs. In Go-VN1 (SEQ ID NO. 33) and Go-VN3 (SEQ ID NO. 37)
cDNAs, stretches of open reading frame can be found in the 5' extremity of the cDNAs that
generate polypeptide sequences of 310 and 152 amino acids, respectively, which are
interrupted by a frameshift in Go-VN1 and by an insertion of 500 nucleotides in Go-VN3. The
prospective receptor protein sequences indicated for Go-VN1 (SEQ ID NO. 33) and Go-VN3
(SEQ ID NO. 37) (Fig. 3) start at the next available methionine and are therefore significantly
shorter than those of other receptor cDNAs.

Go-VN7 (SEQ ID NO. 45) and Go-VN13C (SEQ ID NO. 47) cDNAs show a similar
deletion of 150 bp located at the exact same position in the sequence. Strikingly, the 150 bp
deletion does not alter the open reading frame but generates a gap that encompasses 34 amino
acids upstream of the first transmembrane domain and most of the first transmembrane domain
itself.

Hydropathy analysis of Go-VN7 (SEQ ID NO. 46) and Go-VN13C (SEQ ID NO. 48)
protein sequences detects only a seven to eight amino acid long hydrophobic stretch that might
not be long enough to replace the deleted transmembrane domain 1 and allow the appropriate
folding of the protein. Except for the 150 bp gap, sequences of Go-VN13B (SEQ ID NO. 50) and
Go-VN13C (SEQ ID NO. 48) are identical. This raises the question as to whether both transcripts
might originate from alternative splicing of the same gene. Alternatively, they might be
transcribed from independent genes that evolved from recent duplication and deletion events.

Size of the Go-VN Family of Genes

We investigated the size of the Go-VN family of receptors by hybridizing 32P-labeled
cDNA probes prepared from regions spanning the most divergent N-terminal half of the receptor
protein to rat genomic DNA. Individual probes identify two to four discrete bands under
stringent conditions of hybridization and washing. Under conditions of reduced stringency, each
of the individual probes now generates a unique pattern of 12 to 20 bands, providing a direct
illustration of the existence of a very large family of related genes.

A direct estimate of the size of the Go-VN receptor gene family was obtained by low
stringency screening of a rat genomic library. PCR amplification on genomic DNA had indicated
that receptor genes are devoid of introns in the region encompassing transmembrane domains 2
to 7, enabling us to deduce directly the number of genes present in the rat genome. A mix of
32P-labeled DNA probes prepared from the six Go-VN cDNA fragments identified 110 positive
clones per haploid genome, indicating that the family of Go-VN receptors may consist of 100
genes.

Expression Pattern of Go-VN Receptors 

The pattern of expression of the Go-VN receptor genes was examined by in situ
hybridization with digoxigenin-labeled RNA antisense probes. No signal was observed after
hybridizing the mix of Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41 and 43) receptor
probes to sections of muscle, testis, brain, or whole head. The adult olfactory epithelium was also
consistently negative, although rare positive cells (one to three cells per section) were observed
in the olfactory neuroepithelium of E19 rat embryo. In contrast, strong signals were observed
when antisense receptor RNA probes were hybridized to VNO neuroepithelium. In adults, each
one of the Go-VN probes detects small subsets of VNO sensory neurons. When hybridization
and washing were performed at lower temperature, the number of faintly labeled neurons
increased, revealing cross-hybridization to more distant receptor genes.

Under high stringency conditions, cDNA clones Go-VN1 to Go-VN6 label 1.9%, 3.6%,
6.1%, 0.4%, 3.5%, and 1.3% of the VNO sensory neurons, respectively. Under the same
experimental conditions, the mix of all six Go-VN RNA probes labels 19% of the cells. This
number is similar to the sum of labeled neurons detected with the six individual Go-VN probes
(17%), indicating that probes representing the six receptor subfamilies recognize distinct
populations of VNO sensory neurons.

Spatial Distribution of Go-VN Receptor Transcripts

Positive neurons identified with each of the Go-VN probes were randomly distributed along the
anteroposterior and dorso-ventral axes of the VNO neuroepithelium. Most RNA probes recognize
cells that are preferentially localized in the most basal two-thirds of the neuroepithelium
corresponding to the zone of Go expression. However, careful examination of adjacent
cross-sections of vomeronasal neuroepithelium labeled with each of the Go-VN probes reveals
a well-organized spatial distribution of receptor expression. Different receptors appear
preferentially localized in radial zones that define a series of hemiconcentric rings of distinct
diameters. This pattern is observed along the entire length of the VNO and is conserved in all
animals analyzed. The Go-VN3 (SEQ ID NO. 37) probe, for example, recognizes a subset of
neurons that are confined to the most basal third of the VNO neuroepithelium. In contrast, the
Go-VN1 (SEQ ID NO. 33), Go-VN4 (SEQ ID NO. 39), and Go-VN5 (SEQ ID NO. 41) RNA
probes identify cells restricted to a hemiconcentric zone immediately apical to the area of
Go-VN3 expression, whereas Go-VN2 identifies cells apposed to the apical layer of supporting
cells. Go-VN6 in turn is found only in sparse cells immediately apposed to the basal membrane.
This is best seen in a statistical representation of Go-VN receptor localization collected from
VNO sections and multiple animals that shows a striking conservation of these patterns. Thus,
transcription of Go-VN cDNAs appears restricted to one of three circumscribed areas of the VNO
neuroepithelium in a manner quite reminiscent of the odorant receptor gene expression in four
zones of the MOE (Ressler, K., et al., Cell, 1993, 73:597-609; Vassar, R., et al., Cell, 1993,
74:309-318). Although Go-VN3 (SEQ ID NO. 37) and Go-VN6 (SEQ ID NO. 43) transcripts
show a clear segregation in the most basal region of the VNO neuroepithelium, the sequence
anomalies found in both transcripts leave the functionality of this area of the neuroepithelium as
an open question.

Sexual Dimorphism in Receptor Distribution and Age-Related Changes

To identify potential sexual dimorphism in Go-VN receptor expression, we systematically
hybridized each probe to sections originating from adult male and female rat VNOs. All receptors
were equally distributed in males and females with the striking exception of Go-VN2 (SEQ ID
NO. 35). In females, Go-VN2 appears to be expressed in a large and centrally located region
comprising one-third of the neuroepithelium. In sharp contrast, the same probe recognizes in
males a cohort of cells in the most apical side of the neuroepithelium, closely apposed to the
VNO lumen, and most likely intermingled with Gi2α+ VNO sensory neurons. Such a difference
in the Go-VN2 expression pattern in males and females might result from the expression of the
same receptor gene in a different zone of the VNO epithelium or from a differential expression
of two distinct but closely related genes of the Go-VN2 subfamily. In females, Go-VN2
generates a very intense hybridization signal in most positive neurons and a fainter staining in
a second set of labeled cells. The population of faintly labeled cells was never detected in males,
indicating the existence of a female-specific neuronal subpopulation expressing either a lower
level of the Go-VN2 transcript or a female-specific receptor significantly different from, but still
cross-hybridizing to, the Go-VN2 probe. We followed the emergence of receptor expression and
of the VNO zonal organization during development and postnatal stages preceding puberty.

Go-VN receptor expression is first detected in the VNO of E14 embryos. No significant
difference is observed in the onset of expression of the Gi2α-VN and Go-VN classes of receptor
genes. In agreement with the data of Berghard and Buck, 1996, in mouse, segregation of Gi2α and
Go expression in the apical and basal areas of VNO neuroepithelium, respectively, is not
apparent in the embryo and in 1-week-old animals. In contrast, Gi2α+ cells appear randomly
distributed in large clusters over the whole thickness of the neuroepithelium, intermingled with
Go cells. At 4 weeks after birth, however, Gi2α cells appear clearly localized in the apex of the
epithelium. Similarly, in situ hybridization experiments with mixes of Go-VN and Gi2α-VN
receptor probes on sections of the VNOs dissected from late embryos and 1-week-old animals
show that the two cell populations are still intermingled at early postnatal stages. We observed
that the zonal distribution of the two families of receptors slowly emerges during sexual
maturation to reach the spatial distribution observed in adults. Preliminary data indicate that the
sexually dimorphic expression pattern of Go-VN2 is undetectable at 6 weeks after birth. Thus, in
contrast to the zones of olfactory receptor gene expression, which are already present in the
olfactory epithelium at the earliest stages of receptor gene expression in the embryo (Sullivan,
S., et al., Neuron, 1995, 15:779-789), the spatial organization of the VNO neuroepithelium as
detected by G-protein and receptor gene expression emerges only in a late postnatal period and
reaches its definitive pattern at sexual maturity.

Expression of Go-VN Receptors Is Restricted to Go+ VNO Neurons 

The expression of some of the Go-VN receptors in neurons lining the VNO lumen in an
area mainly occupied by Gi2α+ cells raises the obvious question as to whether the expression of
this family of genes is strictly restricted to Go+ VNO neurons. Single-cell cDNA prepared from
23 individual VNO neurons was analyzed by Southern blots with probes representing the six
divergent subfamilies of Go-VN receptors and was PCR amplified with degenerate primers
based on conserved motifs between Go-VN receptor sequences. Both approaches confirmed that
none of the 19 cell cDNAs prepared from Gi2α+ neurons contained any sequence of the Go-VN
receptor family. In contrast, all four cDNAs generated from Gi2α- cells contained a sequence
related to the Go-VN receptors. PCR products generated with degenerate primers based on
conserved motifs between Go-VN receptor sequences and obtained from the four Go+ cells were
subcloned and sequenced. For each single-cell cDNA, the insert sequences from ten independent
colonies were found to be identical. This set of data strongly suggests that Go-VN receptor
genes are not expressed by Gi2α+ neurons and constitutes preliminary evidence for the expression
of only one Go-VN receptor gene per neuron.

Those skilled in the art will recognize, or be able to ascertain using no more than routine 
experimentation, many equivalents to the specific embodiments of the invention described 
herein. Such equivalents are intended to be encompassed by the following claims. All references
disclosed herein are incorporated by reference in their entirety.

A Sequence Listing is presented below and is followed by what is claimed.

SEQUENCE LISTING

(1) GENERAL INFORMATION 
(i) APPLICANT: PRESIDENT AND FELLOWS OF HARVARD COLLEGE 

(ii) TITLE OF THE INVENTION: NOVEL PHEROMONE RECEPTORS 

(iii) NUMBER OF SEQUENCES: 92 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Wolf, Greenfield & Sacks, P.C. 

(B) STREET: 600 Atlantic Avenue 

(C) CITY: Boston 

(D) STATE: MA 

(E) COUNTRY: U.S.A. 

(F) ZIP: 02210-2211

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER:

(B) FILING DATE:
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/051,284 

(B) FILING DATE: 30-JUN-1997 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Plumer, Elizabeth R. 

(B) REGISTRATION NUMBER: 36,637 

(C) REFERENCE/DOCKET NUMBER: H0498/7074 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 617-720-3500 

(B) TELEFAX: 617-720-2441 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3080 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 57...2606
(D) OTHER INFORMATION: VR1 
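
As a reading aid, the coding-sequence coordinates in the FEATURE entry above can be related to the protein length and to the running nucleotide positions printed at the right of each coding row. The following is a minimal Python sketch under stated assumptions: LOCATION coordinates are taken as 1-based and inclusive and as excluding the stop codon, and coding rows are taken to hold 16 codons after the first coding row. The function names are illustrative only and are not part of the listing.

```python
def codon_count(cds_start: int, cds_end: int) -> int:
    """Number of codons in a coding sequence with 1-based inclusive coordinates."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return length // 3


def row_end_positions(cds_start: int, first_row_codons: int,
                      codons_per_row: int, n_rows: int) -> list:
    """Nucleotide position of the last base in each of the first n_rows coding rows."""
    positions = []
    end = cds_start - 1  # last base before the coding sequence begins
    for row in range(n_rows):
        codons = first_row_codons if row == 0 else codons_per_row
        end += 3 * codons  # each codon advances the position by three bases
        positions.append(end)
    return positions


# SEQ ID NO:1: LOCATION 57...2606, first coding row holds only the Met codon
print(codon_count(57, 2606))
print(row_end_positions(57, 1, 16, 5))
```

With the SEQ ID NO:1 coordinates (57...2606) the sketch gives 850 codons, in agreement with the 850 amino acids stated for SEQ ID NO:2, and the row-end positions advance by 48 nucleotides per full row.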



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GTTTTTCTGC ATCAGAAACG GATTTCACAG CAGCTCCATC TCAGATCCTA GCAGAC ATG 59
Met
1

AAG CAG CTC TGC GCT TTC ACT ATT TCT TTG TTG TTT CTG AAG TTT TCT 107
Lys Gln Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser
5 10 15 

CTC ATC CTG TGC TGT TTG ACT GAA CCA AGT TGC TTT TGG AGA ATA AGG 155 
Leu Ile Leu Cys Cys Leu Thr Glu Pro Ser Cys Phe Trp Arg Ile Arg
20 25 30 

AAT AGT GAA GAT AGT GAT GGA GAT TTA CAA AGG GAA TGT CAT TTT TAC 203 
Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr
35 40 45 

CTT TGG AAA ACT GAT GAA CCT ATT GAA GAT AGT TTT TAT AAT TAT GAT 251 
Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp
50 55 60 65 

TTA AGT TTT AGA ATT GCA GCA AGT GAA TAT GAG TTT CTT CTC GTA ATG 299
Leu Ser Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met
70 75 80 

TTT TTT GCT ATC GAT GAG ATC AAC AGG AAT CCT TAT CTT TTA CCC AAC 347 
Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn Pro Tyr Leu Leu Pro Asn
85 90 95 

ATA ACT TTG ATG TTC TCC TTC ATT GGT GGA AAC TGT CAG GAT TTA TTG 395 
Ile Thr Leu Met Phe Ser Phe Ile Gly Gly Asn Cys Gln Asp Leu Leu
100 105 110

AGA GTT ATG GAC CAA GCA TAT ACA CAA ATA AAT GGA CAT ATG AAT TTT 443 
Arg Val Met Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe
115 120 125 

GTT AAT TAT TTC TGT TAT TTA GAT GAT TCA TGT GCC ATA GGT CTT ACA 491 
Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr
130 135 140 145 

GGA CCA TCA TGG AAA ACT TCC TTA AAA CTG GCA ATG CAC TCT TCG ATG 539
Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met
150 155 160 

CCA CTG GTT TTC TTT GGA CCA TTT AAT CCT AAC CTA CGC GAC CAT GAC 587
Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp
165 170 175 

CGG CTG CCC CAT GTC CAT CAG GTA GCC CCC AAG GAC ACA CAT TTG TCC 635 
Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser
180 185 190 

CAT GGC ATG GTC TCC TTG ATG TTT CAC TTT AGA TGG ACT TGG ATA GGA 683 
His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly
195 200 205 

CTG GTC ATC TCA GAT GAT GAC CAG GGT ATT CAG TTT CTC TCA GAT TTA 731 
Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu
210 215 220 225 

AGA GAA GAA AGC CAA AGG CAT GGG ATC TGT TTA GCT TTT GTT AAT ATG 779 
Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met
230 235 240 



ATC CCA GAA AAC ATG CAG ATA TAC ATG ACA AGG GCT ACA ATA TAT GAT 827
Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp
245 250 255

AAA CAC ATT ATG ACA TCT TCA GCA AAG GTT GTT ATC ATT TAT GGT GAA 875 
Lys His Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu
260 265 270 

ATG AAC TCT ACT CTA GAA GCA AGC TTT AGA AGA TGG GAA GAG TTA GGT 923 
Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gly 
275 280 285 

GCT CGG AGA ATC TGG ATC ACA ACC TCA CAA TGG GAT GTC ATC ACA AAT 971
Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr Asn
290 295 300 305 

AAA AAA GAC TTC ACC CTT AAT CTC TTC CAT GGG ATC ATC ACT TTT GAA 1019 
Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Ile Ile Thr Phe Glu
310 315 320 

CAT CAT AGA TTT GAG ATT CCT AAA TTA AAT AAA TTC ATG CAA ACA ATG 1067 
His His Arg Phe Glu Ile Pro Lys Leu Asn Lys Phe Met Gln Thr Met
325 330 335 

AAC ACT GCC AAA TAC CCA GTA GAT ATT TCT CAT ACT ATA TTG GAG TGG 1115 
Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp
340 345 350 

AAT TAT TTT AAT TGT TCA ATA TCT AAG AAC AGC ATT AGA ATG CAT CAT 1163 
Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ile Arg Met His His
355 360 365 

ATT ACA TTC AAC AAC ACC TTG GAA TGG ACA TCA CTG CAC AAC TAT GAT 1211 
Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ser Leu His Asn Tyr Asp
370 375 380 385 

GTG GCG ATG AGT GAT GAA GGT TAC AAT TTG TAC AAT GCT GTT TAT GCT 1259 
Val Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Ala 
390 395 400 

GTG GCC CAC ACC TAC CAT GAA TAC ATT TTT CAA CAA GTA GAG TCT CAG 1307 
Val Ala His Thr Tyr His Glu Tyr Ile Phe Gln Gln Val Glu Ser Gln
405 410 415 

AAA AAG GCA AAA CCC AAA AGA TAT TTC ACT GCT TGT CAG CAG GTG TCT 1355 
Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Val Ser
420 425 430 

TCC TTG ATG AAA ACC AGG GTA TTT ACG AAC CCT GTT GGA GAA CTG GTG 1403 
Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu Val 
435 440 445 

AAC ATG AAG CAT AGG GAA AAT CAG TGT ACA GAG TAT GAT ATT TTC ATC 1451 
Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe Ile
450 455 460 465 

ATT TGG AAT TTT CCA CAA GGC CTT GGA TTA AAA GTG AAA ATA GGA AGC 1499 
Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Ile Gly Ser
470 475 480 

TAT TTA CCT TGT TTT CCA CAG AGA CAA AAA CTT CAT ATA TCT GAT GAT 1547 
Tyr Leu Pro Cys Phe Pro Gln Arg Gln Lys Leu His Ile Ser Asp Asp
485 490 495 

TTG GAA TGG GCC AAG GGA GGA ACA TCA CCT CAG GTT CCC TCC TCC GTG 1595 
Leu Glu Trp Ala Lys Gly Gly Thr Ser Pro Gln Val Pro Ser Ser Val
500 505 510

TGT AGT GTG GCA TGT ACT GCT GGA TTC AGG AAA ATT TAT CAA AAA GAA 1643
Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile Tyr Gln Lys Glu
515 520 525 

ACA GCA GAC TGC TGC TTT GAT TGT GTT CAG TGC CCA GAA AAT GAG ATT 1691 
Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn Glu Ile
530 535 540 545 

TCC AAC GAA ACA GAT ATG GAA CAG TGT GTG AGG TGT CCA GAT GAT AAG 1739 
Ser Asn Glu Thr Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asp Lys
550 555 560 

TAT GCC AAC ATA GAG CAA ACC CAC TGC CTC TCA AGA GCT GTA TCA TTT 1787 
Tyr Ala Asn Ile Glu Gln Thr His Cys Leu Ser Arg Ala Val Ser Phe
565 570 575 

CTG GCT TAT GAA GAT TCA TTG GGG ATG GCT CTA GGC TGC ATG GCA CTG 1835 
Leu Ala Tyr Glu Asp Ser Leu Gly Met Ala Leu Gly Cys Met Ala Leu 
580 585 590 

TCC TTC TCA GCC ATC ACA ATT CTA ATC CTC GTC ACA TTT GTG AAG TAC 1883 
Ser Phe Ser Ala Ile Thr Ile Leu Ile Leu Val Thr Phe Val Lys Tyr
595 600 605 

AAA GAT ACT CCC ACT GTG AAG GCC AAT AAC CGC ATT CTC AGC TAC ATC 1931 
Lys Asp Thr Pro Thr Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile
610 615 620 625 

CTG CTC ATC TCT CTC GTC TTC TGC TTT CTC TGC TCC CTG CTC TTC ATT 1979 
Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe Ile
630 635 640 

GGA CCT CCC GAC CAG GTC ACC TGC ATC TTT CAG CAG ACC ACA TTT GGA 2027
Gly Pro Pro Asp Gln Val Thr Cys Ile Phe Gln Gln Thr Thr Phe Gly
645 650 655 

GTA TTG TTC ACT GTG TCT GTT TCT ACA GTG TTG GCC AAA ACA ATA ACT 2075 
Val Leu Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile Thr
660 665 670 

GTG GTC ATG GCT TTC AAG CTC ACT ACT CCA GGA AGA AGG ATG AGA GGG 2123 
Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg Gly 
675 680 685

ATG ATG ATG ACA GGG GCA CCT AAG TTG GTC ATT CCC ATT TGT ACC CTG 2171 
Met Met Met Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr Leu
690 695 700 705 

ATC CAA CTT GTT CTC TGT GGA ATC TGG TTG GTC ACA TCT CCT CCC TTT 2219 
Ile Gln Leu Val Leu Cys Gly Ile Trp Leu Val Thr Ser Pro Pro Phe
710 715 720 

ATT GAC AGA GAC ATA CAA TCT GAG CAT GGG AAG ATT GTC ATT CTT TGC 2267 
Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu Cys
725 730 735 

AAT AAA GGC TCA GTC ATT GCC TTC CAC GTC GTC CTG GGA TAC TTG GGC 2315 
Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr Leu Gly
740 745 750 

TCC TTG GCT CTG GGG AGC TTC ACG TTG GCT TTC CTG GCT AGG AAC CTT 2363 
Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn Leu 
755 760 765 



CCT GAC ACA TTC AAT GAA GCC AAG TTC CTA ACT TTC AGC ATG CTG GTG 2411
Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val
770 775 780 785 

TTC TGC AGT GTC TGG ATC ACC TTC CTC CCT GTC TAC CAC AGC ACC AGG 2459 
Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Arg
790 795 800 

GGG AGG GTC ATG GTG GTT GTG GAG GTT TTC TCC ATC TTG GCT TCT AGT 2507 
Gly Arg Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser Ser
805 810 815

GCA GGG TTG CTA ATG TGT ATC TTT GTC CCA AAG TGT TAT GTT ATT TTA 2555 
Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile Leu
820 825 830 

ATT AGA CCA GAT TCA AAT TTT ATA AAG AAC CAC AAA GGT AAA TTG CTT 2603 
Ile Arg Pro Asp Ser Asn Phe Ile Lys Asn His Lys Gly Lys Leu Leu
835 840 845 

TAT TGAAACTTTC ATGGTATGAA AATGTTAGAT GATATTCAAC TTATCTTATT CTTCAT 2662
Tyr
850

CTTAATAAAA GCAGTACTTC ATCATATAAA AAATAAAGTA ATATACAGAT TTATACTTAC 2722 

AAACTGGACA GCAAACATGA ATATGTTGAG AACTGGGATT CTCAATTGAG GAATGGCTAC 2782 

CAATATTTTG ATCTGTGGTT TTGTGTTTAA GCCATGTACT TAATTAATGA TTAATATGAG 2842 

GTTACCCTAC TGTCTTTGAA CAGCGCCACC TCTAGGCATG CTGTCCTTGA GTTATAAGAA 2902

AGGGTACTGC ATACACAATG GACATGAAGC CAGTAATCAA CATTATTCCA CTTGCTTTCA 2962 

TGGAGTTCTT ACATCCAAGT TCATGCCTTG ACTTTATTCA ATGTTCTATG ACAAAGGTAG 3022 

ATAAATAAAT AAACACTTTC CTCGTCGACG CGGCCGCGTC GACGTCGACG CGGCCGCG 3080

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 850 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 



Met Lys Gln Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe
1 5 10 15

Ser Leu Ile Leu Cys Cys Leu Thr Glu Pro Ser Cys Phe Trp Arg Ile
20 25 30

Arg Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe
35 40 45

Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr
50 55 60

Asp Leu Ser Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val
65 70 75 80

Met Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn Pro Tyr Leu Leu Pro
85 90 95

Asn Ile Thr Leu Met Phe Ser Phe Ile Gly Gly Asn Cys Gln Asp Leu
100 105 110

Leu Arg Val Met Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn
115 120 125

Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu
130 135 140

Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser
145 150 155 160

Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His
165 170 175

Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu
180 185 190

Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile
195 200 205

Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp
210 215 220

Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn
225 230 235 240

Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr
245 250 255

Asp Lys His Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly
260 265 270

Glu Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu
275 280 285

Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr
290 295 300

Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Ile Ile Thr Phe
305 310 315 320

Glu His His Arg Phe Glu Ile Pro Lys Leu Asn Lys Phe Met Gln Thr
325 330 335

Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu
340 345 350

Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ile Arg Met His
355 360 365

His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ser Leu His Asn Tyr
370 375 380

Asp Val Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr
385 390 395 400

Ala Val Ala His Thr Tyr His Glu Tyr Ile Phe Gln Gln Val Glu Ser
405 410 415

Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Val
420 425 430

Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu
435 440 445

Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe
450 455 460

Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Ile Gly
465 470 475 480

Ser Tyr Leu Pro Cys Phe Pro Gln Arg Gln Lys Leu His Ile Ser Asp
485 490 495

Asp Leu Glu Trp Ala Lys Gly Gly Thr Ser Pro Gln Val Pro Ser Ser
500 505 510

Val Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile Tyr Gln Lys
515 520 525

Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn Glu
530 535 540

Ile Ser Asn Glu Thr Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asp
545 550 555 560

Lys Tyr Ala Asn Ile Glu Gln Thr His Cys Leu Ser Arg Ala Val Ser
565 570 575

Phe Leu Ala Tyr Glu Asp Ser Leu Gly Met Ala Leu Gly Cys Met Ala
580 585 590

Leu Ser Phe Ser Ala Ile Thr Ile Leu Ile Leu Val Thr Phe Val Lys
595 600 605

Tyr Lys Asp Thr Pro Thr Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr
610 615 620

Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe
625 630 635 640

Ile Gly Pro Pro Asp Gln Val Thr Cys Ile Phe Gln Gln Thr Thr Phe
645 650 655

Gly Val Leu Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile
660 665 670

Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg
675 680 685

Gly Met Met Met Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr
690 695 700

Leu Ile Gln Leu Val Leu Cys Gly Ile Trp Leu Val Thr Ser Pro Pro
705 710 715 720

Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu
725 730 735

Cys Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr Leu
740 745 750

Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn
755 760 765

Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu
770 775 780

Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr
785 790 795 800

Arg Gly Arg Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser
805 810 815

Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile
820 825 830

Leu Ile Arg Pro Asp Ser Asn Phe Ile Lys Asn His Lys Gly Lys Leu
835 840 845

Leu Tyr
850

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2961 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 86...2509
(D) OTHER INFORMATION: VR2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

AGACACATCG GTGCAACTGT GTGTGTGATG TTTTTCTGCA TCAGAAACGG ATTTCACAGC 60 
AGCTCCATCT CAGATCCTAG CAGAC ATG AAG CAG CTC TGC ACT TTC ACT ATT 112 

Met Lys Gln Leu Cys Thr Phe Thr Ile
1 5 

TCA TTG TTG TTT CTG AAG TTT TCT CTC ATC TTG TGC TGT TGG AGT GAA 160 
Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu
10 15 20 25 

CCA AGC TGC TTT TGG AGG ATA AAG AAG AGT GAA GAT AAT GAT GGA GAT 208 
Pro Ser Cys Phe Trp Arg Ile Lys Lys Ser Glu Asp Asn Asp Gly Asp
30 35 40 

TTA CAA AGG GAG TGT CAT TTT TAC CTT TGG AAA ACT GAT GAA CCT ATT 256 
Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile
45 50 55 

GAA GAT AGT TTT TAT AAT TAT GAT TTA AGT TTT AGA ATT GCA GGA AGT 304
Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Gly Ser
60 65 70 



GAA TAT GAG CTT CTT CTG GTA ATG TTT TTT GCT ACT GAT GAG ATC AAC 352
Glu Tyr Glu Leu Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn
75 80 85




AAG AAT CCT TAT CTT TTA CCC AAC ATG AGT TTG ATG TTC TCC ATC ATT 400
Lys Asn Pro Tyr Leu Leu Pro Asn Met Ser Leu Met Phe Ser Ile Ile
90 95 100 105 

GGT GGA AAC TGT CAT GAT TTA TTG AGA AGT CTG GAT CAA GAA TAT GCA 448 
Gly Gly Asn Cys His Asp Leu Leu Arg Ser Leu Asp Gln Glu Tyr Ala
110 115 120 

CAA ATA GAT GGA CAT ATG AAT TTT GTT AAT TAT TTC TGT TAT TTA GAT 496 
Gln Ile Asp Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp
125 130 135 

GAT TCA TGT GCC ACA GGC CTT ACA GGA CCA TCA TGG AAA ACA TCC TTA 544 
Asp Ser Cys Ala Thr Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu 
140 145 150 

AAA CTG GCA ATG CAT TCT TCA ATG CCA CTG GTT TTC TTT GGA CCA TTT 592 
Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe 
155 160 165 

AAT CCT AAC CTA CGC GAC CAT GAC CGG CTG CCC CAT GTC CAT CAG GTA 640 
Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val
170 175 180 185 

GCC CCC AAG GAC ACA CAT TTG TCC CAT GGC ATG GTC TCC TTG ATG TTT 688 
Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe 
190 195 200 

CAT TTT AGG TGG ACT TGG ATA GGA CTG GTC ATC TCA GAT GAT GAT CAG 736 
His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln
205 210 215 

GGT ATT CAG TTT CTC TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT GGG 784 
Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly
220 225 230 

ATC TGT TTG GCT TTT GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA TAC 832
Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr
235 240 245 

ATG ACA AGG GCT ACA ATA TAT GAT ACA CAA ATT ATG ACA TCT TCA GCA 880
Met Thr Arg Ala Thr Ile Tyr Asp Thr Gln Ile Met Thr Ser Ser Ala
250 255 260 265 

AAG GTT GTT ATC ATT TAT GGT GAC ATG AAC TCT ACT CTA GAA GCA AGC 928 
Lys Val Val Ile Ile Tyr Gly Asp Met Asn Ser Thr Leu Glu Ala Ser
270 275 280 

TTT AGA AGA TGG GAA GAG TTA GGT GCT CGG AGA ATC TGG ATC ACA ACC 976 
Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr
285 290 295 

ACA CAA TGG GAT GTC ATC ACA AAT AAA AAA GAC TTC ACC CTT AAT CTC 1024 
Thr Gln Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu
300 305 310 

TTC CAT GGG ACT ATT ACT TTT GCA CAC CAC AAA GAT GAG ATT CCT AAA 1072 
Phe His Gly Thr Ile Thr Phe Ala His His Lys Asp Glu Ile Pro Lys
315 320 325 

TTT AGG AAT TTT ATG CAA ACA AAG AAA ACT GCC AAA TAC CTT GTA GAT 1120 
Phe Arg Asn Phe Met Gln Thr Lys Lys Thr Ala Lys Tyr Leu Val Asp
330 335 340 345 



ATT TCT CAT ACT ATT TTG GAG TGG AAT TAT TTT AAT TGT TCA ATC TCT 1168
Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser
350 355 360 

AAG AAC AGC AGT AAA ATG GGT CAT TTT ACA TTC AAC AAC ACA TTG CAA 1216 
Lys Asn Ser Ser Lys Met Gly His Phe Thr Phe Asn Asn Thr Leu Gln
365 370 375

TGG ACA GCA CTG CAC AAC TAT GAT ATG GCC CTG AGC GAT GAA GGT TAC 1264 
Trp Thr Ala Leu His Asn Tyr Asp Met Ala Leu Ser Asp Glu Gly Tyr 
380 385 390 

AAT TTG TAT AAT GCT GTT TAT GCT GTG GCC CAC ACC TAC CAT GAA TAC 1312 
Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu Tyr 
395 400 405 

ATT CTT CAA CAA GTA GAG TCT CAG AAA AAG GCA AAA CCC AAA AGA TAT 1360 
Ile Leu Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Tyr
410 415 420 425 

TTC ACT GCT TGT CAG CAG GTG TCT TCC TTG ATG AAA ACC AGG GTA TTT 1408 
Phe Thr Ala Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe
430 435 440 

ATG AAC CCT GTT GGA GAA CTG GTG AAC ATG AAG CAT AGG GAA AAT CAG 1456 
Met Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln
445 450 455 

TGT ACA GAG TAT GAT ATT TTC ATC ATT TGG AAT TTT CCA CAA GGC CTT 1504 
Cys Thr Glu Tyr Asp Ile Phe Ile Ile Trp Asn Phe Pro Gln Gly Leu
460 465 470 

GGA TTA AAA GTG AAA GTA GGA AGC TAT TTA CCT TGC TTT CCA AAG AGT 1552 
Gly Leu Lys Val Lys Val Gly Ser Tyr Leu Pro Cys Phe Pro Lys Ser 
475 480 485 

CAA CAA CTT CAT ATA GCT GAT GAT TTG GAA TGG GCC ATG GGA GGA ACA 1600 
Gln Gln Leu His Ile Ala Asp Asp Leu Glu Trp Ala Met Gly Gly Thr
490 495 500 505 

TCA GTG GAT ATG GAA CAG TGT GTG AGA TGT CCA GAT AAT AAA TAT GCC 1648 
Ser Val Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asn Lys Tyr Ala
510 515 520 

AAT TTA GAG CAA ACC CAC TGC CTC CAA AGA ACG GTG TCA TTT CTG GCT 1696 
Asn Leu Glu Gln Thr His Cys Leu Gln Arg Thr Val Ser Phe Leu Ala
525 530 535 

TAT GAA GAT CCA TTG GGG ATG GCT CTA GGC TGC ATG GCA CTG TCC TTC 1744 
Tyr Glu Asp Pro Leu Gly Met Ala Leu Gly Cys Met Ala Leu Ser Phe 
540 545 550 

TCG GCC ATC ACA ATT CTA GTC CTC GTC ACA TTT GTG AAG TAC AAG GAT 1792 
Ser Ala Ile Thr Ile Leu Val Leu Val Thr Phe Val Lys Tyr Lys Asp
555 560 565 

ACT CCC ATT GTG AAG GCC AAT AAC CGC ATT CTC AGC TAC ATC CTG CTC 1840 
Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Leu
570 575 580 585 

ATC TCT CTC GTC TTC TGC TTT CTC TGT TCC CTG CTC TTC ATT GGA CAT 1888 
Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His
590 595 600 



CCC GAC CAG GTC ACC TGC ATC TTG CAG CAG ACC ACA TTT GGA GTA TTG 1936
Pro Asp Gln Val Thr Cys Ile Leu Gln Gln Thr Thr Phe Gly Val Leu
605 610 615



TTC ACT GTG TCT GTT TCT ACA GTG TTG GCC AAA ACA ATA ACT GTG GTC 1984 
Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val
620 625 630 

ATG GCT TTC AAG CTC ACT ACT CCA GGA AGA AGG ATG AGA GGG ATG ATG 2032 
Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg Gly Met Met 
635 640 645 

ATG ACA GGG GCA CCT AAG TTG GTC ATT CCC ATT TGT ACC CTG ATC CAA 2080 
Met Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr Leu Ile Gln
650 655 660 665 

CTT GTT CTC TGT GGA ATC TGG TTG GTC ACA TCT CCT CCC TTT ATT GAC 2128 
Leu Val Leu Cys Gly Ile Trp Leu Val Thr Ser Pro Pro Phe Ile Asp
670 675 680 

AGA GAT ATA CAA TCT GAA CAT GGG AAG ATT GTC ATT CTT TGC AAT AAA 2176 
Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu Cys Asn Lys
685 690 695

GGC TCT GTC GTT GCC TTC CAC GTC GTC CTG GGA TAC TTG GGC TCC TTG 2224 
Gly Ser Val Val Ala Phe His Val Val Leu Gly Tyr Leu Gly Ser Leu 
700 705 710 

GCT CTG GGG AGC TTC ACT TTG GCT TTC TTG GCT AGG AAC CTT CCT GAC 2272 
Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn Leu Pro Asp 
715 720 725 

ACA TTC AAT GAA GCC AAG TTC CTA ACT TTC AGC ATG CTG GTG TTC TGC 2320 
Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys 
730 735 740 745 

AGT GTC TGG ATC ACC TTC CTC CCT GTC TAC CAC AGC ACC AGG GGG AAG 2368 
Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Lys
750 755 760 

GTC ATG GTG GTT GTG GAG GTT TTC TCC ATC TTG GCT TCT AGT GCA GGG 2416 
Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser Ser Ala Gly
765 770 775 

TTG CTA ATG TGT ATC TTT GTC CCA AAG TGT TAT GTT ATT TTA ATT AGA 2464 
Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile Leu Ile Arg
780 785 790 

CCA GAT TCA AAT TTT ATA CAG AAC CAC AAA GGT AAA TTG CTT TAT TGAAA 2514 
Pro Asp Ser Asn Phe Ile Gln Asn His Lys Gly Lys Leu Leu Tyr
795 800 805 



CTTTCATGGT ATGAAAATGT TAGATGATAT TCAACTTATC TTATTCTTCA TCTTAATAAA 2574
AGCAGTACTT CATCATATAA AAAATAAAGT AATATACAGA TTTATACTTA CAAACTGGAC 2634
AGCAAACATG AATATGTTGA GAACTGGGAT TCTCAATTGA GGAATGGCTA CCAATATTTT 2694
GATCTGTGGT TTTGTGTTTA AGCCATGTAC TTAATTAATG ATTAACATGA GGTTACCCTA 2754
CTGTCTTTGA ACAGCGCCAC CTCTAGGCAT GCTGTCCTTG AGTTATAAGA AAGGGTACTG 2814
CATACACAAT GGACATGAAG CCAGTAATCA ACATTATTCC ACTTGCTTTC ATGGAGTTCT 2874
TACTTCCAAG TTCATGCCTT GACTTTATTC AATGTTCTAT GACAAAGGTA GAATAAATAA 2934
ATAAACACTT TCCTCACAAA AAAAAAA 2961

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 808 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe
1 5 10 15

Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile
20 25 30

Lys Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe
35 40 45

Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr
50 55 60

Asp Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val
65 70 75 80

Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro
85 90 95

Asn Met Ser Leu Met Phe Ser Ile Ile Gly [...] Asn Cys His Asp Leu
100 105 110

Leu Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn
115 120 125

Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu
130 135 140

Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser
145 150 155 160

Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His
165 170 175

Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu
180 185 190

Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile
195 200 205

Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp
210 215 220

Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn
225 230 235 240

Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr
245 250 255

Asp Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly
260 265 270

Asp Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu
275 280 285

Gly Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr
290 295 300

Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe
305 310 315 320

Ala His His Lys Asp Glu Ile Pro Lys Phe Arg Asn Phe Met Gln Thr
325 330 335

Lys Lys Thr Ala Lys Tyr Leu Val Asp Ile Ser His Thr Ile Leu Glu
340 345 350

Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Gly
355 360 365

His Phe Thr Phe Asn Asn Thr Leu Gln Trp Thr Ala Leu His Asn Tyr
370 375 380

Asp Met Ala Leu Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr
385 390 395 400

Ala Val Ala His Thr Tyr His Glu Tyr Ile Leu Gln Gln Val Glu Ser
405 410 415

Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Val
420 425 430

Ser Ser Leu Met Lys Thr Arg Val Phe Met Asn Pro Val Gly Glu Leu
435 440 445

Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe
450 455 460

Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Val Gly
465 470 475 480

Ser Tyr Leu Pro Cys Phe Pro Lys Ser Gln Gln Leu His Ile Ala Asp
485 490 495

Asp Leu Glu Trp Ala Met Gly Gly Thr Ser Val Asp Met Glu Gln Cys
500 505 510

Val Arg Cys Pro Asp Asn Lys Tyr Ala Asn Leu Glu Gln Thr His Cys
515 520 525

Leu Gln Arg Thr Val Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Met
530 535 540

Ala Leu Gly Cys Met Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Val
545 550 555 560

Leu Val Thr Phe Val Lys Tyr Lys Asp Thr Pro Ile Val Lys Ala Asn
565 570 575

Asn Arg Ile Leu Ser Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe
580 585 590

Leu Cys Ser Leu Leu Phe Ile Gly His Pro Asp Gln Val Thr Cys Ile
595 600 605

Leu Gln Gln Thr Thr Phe Gly Val Leu Phe Thr Val Ser Val Ser Thr
610 615 620

Val Leu Ala Lys Thr Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr
625 630 635 640

Pro Gly Arg Arg Met Arg Gly Met Met Met Thr Gly Ala Pro Lys Leu
645 650 655

Val Ile Pro Ile Cys Thr Leu Ile Gln Leu Val Leu Cys Gly Ile Trp
660 665 670

Leu Val Thr Ser Pro Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His
675 680 685

Gly Lys Ile Val Ile Leu Cys Asn Lys Gly Ser Val Val Ala Phe His
690 695 700

Val Val Leu Gly Tyr Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu
705 710 715 720

Ala Phe Leu Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe
725 730 735

Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu
740 745 750

Pro Val Tyr His Ser Thr Arg Gly Lys Val Met Val Val Val Glu Val
755 760 765

Phe Ser Ile Leu Ala Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val
770 775 780

Pro Lys Cys Tyr Val Ile Leu Ile Arg Pro Asp Ser Asn Phe Ile Gln
785 790 795 800

Asn His Lys Gly Lys Leu Leu Tyr
805



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2907 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 1...2409 

(D) OTHER INFORMATION: VR3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



CAT TTT TAC CTT GGG GCA GTT GAT AAA CCA ATT GAA GAT AAT TTT TAT 48
His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile Glu Asp Asn Phe Tyr
1 5 10 15

AAT TCA CTT TTA AAG TTT AGA ATT GCA GCA AGT GAA TAT GAG TTT CTT 96 
Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu
20 25 30 

CTG GTA ATG TTT TTT GCT ACT GAT GAG ATC AAC AAG AAT CCT TAT CTT 144 
Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu
35 40 45 

TTA CCC AAC ATA ACT TTG ATG TTC TCC ATC ATT GGT GGA AAC TGT CAT 192 
Leu Pro Asn Ile Thr Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His
50 55 60

GAT TTA TTG AGA GGT TTG GAT CAA GCA TAT ACA CAA ATA AAT GGA CAT 240 
Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr Thr Gln Ile Asn Gly His
65 70 75 80 

ATG AAT TTT GTT AAT TAT TTC TGT TAT TTA GAT GAT TCA TGT GCC ATA 288 
Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile
85 90 95 

GGT CTT ACA GGA CCA TCA TGG AAA ACA TCC TTA AAT CTG GCA ATG CAT 336 
Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Asn Leu Ala Met His 
100 105 110

TCT TCA ATG CCA CTG GTT TTC TTT GGA TCA TTT AAT CCT AAC CTA CAT 384
Ser Ser Met Pro Leu Val Phe Phe Gly Ser Phe Asn Pro Asn Leu His 
115 120 125 

GAC CAT GAC CGG CTG CAC CAT GTC CAT CAA GTA GCC ACC AAG GAC ACA 432 
Asp His Asp Arg Leu His His Val His Gln Val Ala Thr Lys Asp Thr
130 135 140 

CAT TTG TCC CAT GGC ATT GTC TCC TTG ATG TTT CAT TTT AGA TGG ACT 480 
His Leu Ser His Gly Ile Val Ser Leu Met Phe His Phe Arg Trp Thr
145 150 155 160 

TGG ATA GGA CTG GTC ATC TCA GAT GAT GAC AAG GGT ATT CAG TTT CTC 528 
Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Lys Gly Ile Gln Phe Leu
165 170 175 

TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT GGG ATC TGT TTA GCT TTT 576 
Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe
180 185 190 

GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA TAC ATG ACA AGG GCT ACA 624 
Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr
195 200 205 

ATA TAT GAT AAA CAA ATT ATG ACG TCT TTA GCA AAA GTT GTT ATC ATT 672 
Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu Ala Lys Val Val Ile Ile
210 215 220 

TAT GGT GAA ATG AAC TCT ACA CTA GAA GTA AGC TTT AGA AGA TGG GAA 720 
Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu 
225 230 235 240 

AAT TTA GGT GCT CGG AGA ATC TGG ATC ACA ACC TCA CAA TGG GAT GTC 768 
Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val
245 250 255 

ATC ACA AAT AAA AAA GAA TTC ACC CTT AAT CTC TTC CAT GGG ACT ATT 816 
Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Thr Ile
260 265 270 




ACT TTT GCA CAC CGC AGA TTT GAG ATT CCT AAA TTT AAA AAA TTT ATG 864 
Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys Phe Lys Lys Phe Met
275 280 285 

CAA ACA ATG AAC ACT GCC AAA TAC CCA GTA GAT ATT TCT CAT ACT ATA 912 
Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile
290 295 300 

TTG GAG TGG AAT TAT TTT AAT TGT TCA ATC TCT AAG AAC AGC AGT AAA 960 
Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys
305 310 315 320 

ATG GAT CAT ATT ACA TTC AAC AAC ACA TTG GAA TGG ACA GCA CTG CAC 1008 
Met Asp His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ala Leu His
325 330 335 

AAC TAT GAT ATG GTG ATG AGT GAT GAA GGT TAC AAT TTG TAT AAT GCT 1056 
Asn Tyr Asp Met Val Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala 
340 345 350 

GTT TAT GCT GTG GCC CAC ACC TAC CAT GAA CAT ATT TTT CAA CAA GTA 1104 
Val Tyr Ala Val Ala His Thr Tyr His Glu His Ile Phe Gln Gln Val
355 360 365 

GAG TCT CAG AAA AAG GCA AAA CCC AAA AGA TTT TTC ACT GTT TGT CAG 1152 
Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Phe Phe Thr Val Cys Gln
370 375 380 

CAG GTG TCT TCC TTG ATG AAA ACC AGG GTA TTT ACT AAC CCT GTT GGA 1200
Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly
385 390 395 400 

GAA CTG GTG AAC ATG AAG CAT AGG GAA AAT CAG TGT ACA GAG TAT GAC 1248 
Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp
405 410 415 

ATT TTC CTC ATT TGG AAC TTT CCA CAA GGC CTT GGA TTA AAA GTG AAA 1296 
Ile Phe Leu Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys
420 425 430 

ATA GGA AGC TAT TTA CCT TGT TTT CCA CAG AGA CAA GAA CTT CAT ATA 1344 
Ile Gly Ser Tyr Leu Pro Cys Phe Pro Gln Arg Gln Glu Leu His Ile
435 440 445 

TCT GAT GAT TTG GAA TGG GCC ATG GGA GGA ACA TCA GTG GTT CCC TCC 1392
Ser Asp Asp Leu Glu Trp Ala Met Gly Gly Thr Ser Val Val Pro Ser 
450 455 460 

TCT GTG TGT AGT GTG GCA TGT ACT GCA GGA TTC AGG AAA ATT CAT CAG 1440 
Ser Val Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile His Gln
465 470 475 480 

AAA GAA ACA GCA GAC TGC TGC TTT GAT TGT GTT CAG TGC CCA GAA AAT 1488 
Lys Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn
485 490 495 

GAG GTT TCC AAT GAA ACA GAT ATG GAA CAG TGT GTG AAG TGT CCA TAT 1536 
Glu Val Ser Asn Glu Thr Asp Met Glu Gln Cys Val Lys Cys Pro Tyr
500 505 510 

GAT AAG TAT GCC AAC ATA GAG AAA ACC CAC TGC CTC TCA AGA GCT GTA 1584 
Asp Lys Tyr Ala Asn Ile Glu Lys Thr His Cys Leu Ser Arg Ala Val
515 520 525 



TCA TTT CTG GCT TAT GAA GAT CCA TTG GGG ATA GCT CTA GGC TGC ATA 1632
Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Ile Ala Leu Gly Cys Ile
530 535 540 

GCA CTG TCC TTC TCA GCC ATC ACA ATT CTA GTA CTA ATC ACA TTT TTG 1680 
Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Val Leu Ile Thr Phe Leu
545 550 555 560 

AAG TAC AAG GAT ACT CCC ATT GTG AAG GCC AAT AAC CGC ATT CTC AGC 1728 
Lys Tyr Lys Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser
565 570 575 

TAC ATC CTG CTC ATC TCT CTA GTC TTC TGC TTT CTC TGC TCC CTG CTC 1776 
Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu
580 585 590 

TTC ATT GGA CAT CCA AAC CAG GTC TCC TGC GTC TTG CAG CAG ACC ACA 1824 
Phe Ile Gly His Pro Asn Gln Val Ser Cys Val Leu Gln Gln Thr Thr
595 600 605 

TTT GGA GTA TTT TTC ACT GTG TCT GTT TCT ACA GTG TTG GCC AAA ACA 1872
Phe Gly Val Phe Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr 
610 615 620 

ATA ACT GTG GTC ATG GCT TTC AAG CTC ACT ACT CCA GGA AGA AGA ATG 1920 
Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met
625 630 635 640 

AGA GAG ATG TTG GTA ACA GGG GCA CCT AAG TTG GTC ATT CCC ATT TGT 1968 
Arg Glu Met Leu Val Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys
645 650 655 



ACC CTA ATC CAA TTT GTT CTC TGT GGA ATC TGG TTG ATA ACA TCT CCT 2016
Thr Leu Ile Gln Phe Val Leu Cys Gly Ile Trp Leu Ile Thr Ser Pro
660 665 670 

CCA TTT ATT GAC AGA GAT ATA CAA TCT GAG CAT GGG AAG ATT GTC ATT 2064
Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile
675 680 685 

CTT TGC AAT AAA GGC TCT GTC ATT GCC TTC CAT GTT GTC CTG GGA TAC 2112
Leu Cys Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr
690 695 700 

TTG GGC TCC TTG GCT CTG GGG AGC TTC ACT TTG GCT TTC TTG GCT AGG 2160 

Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg 

705 710 715 720 



AAC CTT CCT GAC ACA TTC AAT GAA GCC AAA TTC CTG ACT TTC AGC ATG 2208 

Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met 
725 730 735 

CTG GTG TTC TGC AGT GTC TGG ATC ACC TTT CTC CCT GTC TAC CAT AGC 2256
Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser
740 745 750 

ACC AGG GGG AAG GTC ATG GTG GTT GTG GAG GTT TTC TCA ATC TTG GCT 2304
Thr Arg Gly Lys Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala
755 760 765 



TCT AGT GCA GGG TTG CTA ATG TGT ATC TTT GTC CCA AAG TGT TAT GTT 2352 
Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val
770 775 780 

ATT TTA GTT AGA CCA GAT TCA AAT TTT ATA CGG AAG TAC AAA GAT AAA 2400 
Ile Leu Val Arg Pro Asp Ser Asn Phe Ile Arg Lys Tyr Lys Asp Lys
785 790 795 800

TTT CGT TAT TGAAATATTC ATACTATGAA AATGTTAGAT TATACTCAAC ATATTTTTC 2458 
Phe Arg Tyr



TTTGTCTTAA CAAAAGTAGT ACTTAATCTT ATAAAAATTT AAATAATATA CAAATTTGAA 2518

CTTACAAACA GGACAGAACT GTCTATTGTA ATACCAATTA CAAAACTTTG GTGAAAAATG 2578 

GTCTCATTCA TAAGGACACA ATTCTGAAGA TATTGAGAAC CAGGAATCTC AACTGCGGAA 2638

ACGCTACCAT CATCCTGACC TGTGGTTTTG TGTGTAAAGC ATGAACTTAA TTAATGATTA 2698 

ATATAAGGTG ACCATACTGA CTGTGAACAC TACCATCTCT GGGCAAGTTG TTCTTGTAGT 2758 

TGTAAGAAAA AGCTCTGAAG ACAACATGGA AGTAAAGCCA GTAATCACCA TTATCCCTCA 2818

TGCTTTCATG GAGTGGCTGC ATCCAATTTC ATGCCTTGGC TTCATTCAAT ATACTGTGAC 2878 

CAAGGTACAT AAGTAAAGAA ACACTTTTC 2907 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 803 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile Glu Asp Asn Phe Tyr
1 5 10 15

Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu
20 25 30

Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu
35 40 45

Leu Pro Asn Ile Thr Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His
50 55 60

Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr Thr Gln Ile Asn Gly His
65 70 75 80

Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile
85 90 95

Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Asn Leu Ala Met His
100 105 110

Ser Ser Met Pro Leu Val Phe Phe Gly Ser Phe Asn Pro Asn Leu His
115 120 125

Asp His Asp Arg Leu His His Val His Gln Val Ala Thr Lys Asp Thr
130 135 140

His Leu Ser His Gly Ile Val Ser Leu Met Phe His Phe Arg Trp Thr
145 150 155 160

Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Lys Gly Ile Gln Phe Leu
165 170 175

Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe
180 185 190

Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr
195 200 205

Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu Ala Lys Val Val Ile Ile
210 215 220

Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu
225 230 235 240

Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val
245 250 255

Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Thr Ile
260 265 270

Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys Phe Lys Lys Phe Met
275 280 285

Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile
290 295 300

Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys
305 310 315 320

Met Asp His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ala Leu His
325 330 335

Asn Tyr Asp Met Val Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala
340 345 350

Val Tyr Ala Val Ala His Thr Tyr His Glu His Ile Phe Gln Gln Val
355 360 365

Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Phe Phe Thr Val Cys Gln
370 375 380

Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly
385 390 395 400

Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp
405 410 415

Ile Phe Leu Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys
420 425 430

Ile Gly Ser Tyr Leu Pro Cys Phe Pro Gln Arg Gln Glu Leu His Ile
435 440 445

Ser Asp Asp Leu Glu Trp Ala Met Gly Gly Thr Ser Val Val Pro Ser
450 455 460

Ser Val Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile His Gln
465 470 475 480

Lys Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn
485 490 495

Glu Val Ser Asn Glu Thr Asp Met Glu Gln Cys Val Lys Cys Pro Tyr
500 505 510

Asp Lys Tyr Ala Asn Ile Glu Lys Thr His Cys Leu Ser Arg Ala Val
515 520 525

Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Ile Ala Leu Gly Cys Ile
530 535 540

Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Val Leu Ile Thr Phe Leu
545 550 555 560

Lys Tyr Lys Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser
565 570 575

Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu
580 585 590

Phe Ile Gly His Pro Asn Gln Val Ser Cys Val Leu Gln Gln Thr Thr
595 600 605

Phe Gly Val Phe Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr
610 615 620

Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met
625 630 635 640

Arg Glu Met Leu Val Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys
645 650 655

Thr Leu Ile Gln Phe Val Leu Cys Gly Ile Trp Leu Ile Thr Ser Pro
660 665 670

Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile
675 680 685

Leu Cys Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr
690 695 700

Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg
705 710 715 720

Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met
725 730 735

Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser
740 745 750

Thr Arg Gly Lys Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala
755 760 765

Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val
770 775 780

Ile Leu Val Arg Pro Asp Ser Asn Phe Ile Arg Lys Tyr Lys Asp Lys
785 790 795 800

Phe Arg Tyr

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3625 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:



(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 117...2672
(D) OTHER INFORMATION: VR4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TGAATATGCA ATAAACCTCA CATTTGCACA AAGAAATAAA AGCTGGTAGA AATCTGATGT 60 
GCTGATATGC ATGGCACTTC ACAATCCGCA CTGCCCAGGT TTAAGGCAGG AAAAAG ATG 119

Met 
1 

TTC ATT TTC ATG GGA GTC TTC TTC CTA CTT AAT ATT ACA CTT CTC ATG 167 
Phe Ile Phe Met Gly Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met
5 10 15 

GCC AAT TTC ATT GAT CCC AGG TGC TTT TGG AGA ATA AAT TTG GAT GAA 215 
Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu
20 25 30 

ATA ACG GAT GAA TAT TTG GGA TTA TCT TGT GCT TTC ATC CTG GCA GCT 263 
Ile Thr Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala Ala
35 40 45 

GTT CAG ACA CCC ATT GAA AAA GAT TAT TTC AAC ACG ACT CTT AAT TTT 311 
Val Gln Thr Pro Ile Glu Lys Asp Tyr Phe Asn Thr Thr Leu Asn Phe
50 55 60 65 

CTA AAA ACT ACT AAA AAC CAC AAA TAT GCT TTG GCA TTG GTG TTT GCA 359 
Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala 
70 75 80 

ATG GAT GAA ATC AAC AGA TAT CCT GAT CTT TTA CCA AAT ATG TCT TTG 407 
Met Asp Glu Ile Asn Arg Tyr Pro Asp Leu Leu Pro Asn Met Ser Leu
85 90 95 

ATT ATC AGA TAC TCT TTG GGC CAT TGT GAT GGA AAA ACT GTA ACA CCT 455 
Ile Ile Arg Tyr Ser Leu Gly His Cys Asp Gly Lys Thr Val Thr Pro
100 105 110 

ACA CCA TAT TTA TTT CAT AGA AAA AAG CAA AGC CCT ATT CCT AAT TAT 503 
Thr Pro Tyr Leu Phe His Arg Lys Lys Gln Ser Pro Ile Pro Asn Tyr
115 120 125 

TTC TGT AAT GAA GAG AGT ATG TGT TCA TTT CTG CTT TCA GGA CCC AAT 551 
Phe Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn 
130 135 140 145 

TGG GAT GAA TCT TTA AGT TTC TGG AAG TAC CTG GAC AGC TTC TTA TCT 599 
Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser 
150 155 160 



CCA CGT ATC CTT CAG CTT TCC TAT GGA TCT TTC AGT TCC ATC TTC AGT 647
Pro Arg Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser
165 170 175

GAT GAT GAA CAA TAT CCC TAT CTC TAT CAG ATG GCC CCA AAA GAC ACA 695 
Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr
180 185 190 

TCT CTA GCA TTG GCA ATG GTC TCC TTC ATA CTT TAT TTG AAA TGG AAT 743 
Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp Asn
195 200 205 

TGG ATT GGC CTT GTC ATC CCA GAT GAT GAT CAA GGA AAC CAA TTT CTT 791 
Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu
210 215 220 225 

TTA GAG TTG AAG AAA CAG AGT GAA AAC AAA GAA ATT TGC TTT GCC TTT 839
Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe
230 235 240 

GTG AAA ATG ATC TCT GTT GAT GAA GTT TCA TTT CCA CAA AAA ACT GAA 887 
Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu
245 250 255

ATA AAC TAC AAA CAA ATT GTG AAG TCA CTA ACA AAT GTT ATT ATC ATT 935 
Ile Asn Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile
260 265 270 

TAT GGA GAA ACA TAT AAT TTC ATT GAT TTG ATC TTC AGA ATG TGG GAA 983 
Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu
275 280 285 

CCT CCC ATT TTA CAG AGA ATA TGG ATC ACC ACA AAA CAA TTG AAT TTC 1031 
Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe
290 295 300 305 

CCT ACC AGT AAG ACA GAC ATA AGT CAT GAC ACA TTC TAT GGA TCA CTT 1079 
Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser Leu
310 315 320 

ACT TTT CTA CCC CAC CAT GGT GAG ATT TCT GGC TTT AAA AAT TTT GTA 1127 
Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val
325 330 335 

CAG ACA TGG TTC CAT CTC AGA AAC ACA GAT TTA TGT CTA GTA ATG CCA 1175 
Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Cys Leu Val Met Pro
340 345 350 

GAG TGG AAA TAT ATT AAC TCT GAA GAC TCA GCA TCT AAT TGT AAA ATA 1223 
Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys Ile
355 360 365 

CTT AAG AAC AGT TCA TCT GAT GCC TCA TTT GAT TGG CTA ATG GAA GAG 1271 
Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu Glu 
370 375 380 385 

AAG CTT GAC ATG GCC TTT AGT GAG AAT AGT CAT AAC ATA TAT AAT GCT 1319 
Lys Leu Asp Met Ala Phe Ser Glu Asn Ser His Asn Ile Tyr Asn Ala
390 395 400 

GTG CAT GCC ATA GCC CAT GCC CTC CAT GAG ATG AAT CTG CAA CAG GCT 1367 
Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala
405 410 415 

GAT AAT CAG GCA ATA GAT AAT GGA AAA GGA GCC AGT TCT CAC TGC TTG 1415 
Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Leu
420 425 430

AAG GTA AAC TCC TTT CTA AGA AGG ACC TAC TTC ACT AAT CCT CTT GGG 1463
Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly
435 440 445 

GAC AAA GTG TTT ATG AAG CAA AGA GTA ATA ATG CAG GAT GAA TAT GAC 1511 
Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp
450 455 460 465 

ATT GTT CAC TTT GCG AAT CTC TCA CAA CAC CTT GGG ATT AAG ATG AAG 1559 
Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys
470 475 480 

TTA GGA AAG TTC AGC CCA TAT TTA CCA CAT GGT CGA CAC TCT CAC TTA 1607 
Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu 
485 490 495 

TAC GTA GAC ATG ATT GAG TTG GCC ACA GGA AGA AGA AAG ATG CCA TCC 1655 
Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser
500 505 510 

TCT GTG TGC AGT GCA GAT TGT AGT CCT GGA TTC AGA AGA TTA TGG AAG 1703 
Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys 
515 520 525 

GAG GGA ATG GCA GCC TGC TGT TTT GTT TGC AGC CCC TGC CCT GAA AAT 1751 
Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn 
530 535 540 545 

GAA ATT TCT AAT GAG ACA AAT ATG GAT CAA TGC GTG AAT TGT CCA GAA 1799 
Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu
550 555 560 

TAC CAA TAT GCC AAC ACA GAA CAG AAC AAA TGT ATT CAG AAA GGT GTC 1847 
Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val
565 570 575 

ACC TTC CTA AGC TAT GAA GAC CCC TTG GGG ATG GCA CTT GCC TTA ATG 1895 
Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met 
580 585 590 

GCC TTC TGC TTC TCT GCA TTC ACA GCT GTG GTA CTT TGT GTC TTT GTG 1943 
Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe Val 
595 600 605 

AAG CAC CAT GAC ACT CCT ATT GTG AAG GCC AAT AAC AGA AGC CTC AGC 1991 
Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser
610 615 620 625 

TAT CTA TTA CTC ATG TCA CTC ATG TTC TGT TTT CTG TGC TCC TTT TTC 2039 
Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe 
630 635 640 

TTC ATT GGC CTT CCA AAC AAA GTC ATC TGT GTC TTA CAG CAA ATC ACA 2087 
Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr
645 650 655 

TTT GGA ATT GTA TTC ACT GTG GCT GTT TCC ACA GTT CTG GCC AAA ACA 2135 
Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr
660 665 670

GTC ACT GTG GTT CTA GCT TTC AAA GTC ACA GTC CCA GGA AGA AGA TTG 2183 
Val Thr Val Val Leu Ala Phe Lys Val Thr Val Pro Gly Arg Arg Leu 
675 680 685 

AGA TAC TTC CTT GTA TCA GGG ACA CTA AAC TAC ATT ATT CCT ATA TGT 2231
Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys
690 695 700 705 

TCC CTA CTC CAA TGT GTT CTG TGT GCA ATC TGG CTA GCA GTC TCT CCT 2279
Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro
710 715 720 

CCC TTT GTT GAT ATT GAT GAA CAC TCT CAG CAT GGC CAC ATC ATC ATT 2327
Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile
725 730 735 

GTG TGC AAC AAG GGC TCA GTT ACT GCA TTC TAC TGT GTC CTT GGA TAC 2375 
Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr 
740 745 750 

TTG GCC TGC CTG GCA CTG GGA AGC TTC ACT TTG GCT TTC TTG GCC AAG 2423 
Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys 
755 760 765 

AAT CTG CCT GAT GCA TTC AAT GAA GCC AAG TTC TTG ACC TTC AGC ATG 2471 
Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met 
770 775 780 785 

CTA GTG TTC TGC AGT GTC TGG GTC ACC TTC CTC CCT GTG TAC CAT AGC 2519 
Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser 
790 795 800 

ACA AAG GGC AAA CAC ATG GTT GCT GTG GAG ATC TTC TCT ATC TTG GCA 2567 
Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala
805 810 815 

TCC AGT GCA GGG ATG CTT GGA TGT ATT TTT GTA CCC AAG ATT TAT ATC 2615 
Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr Ile
820 825 830 

ATT TTA ATG AGA CCA GAG AGA AAT TCT ACC CAA AAG ATC AGA GAA AAA 2663 
Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys
835 840 845 

TCA TAT TTT TGAACAAATA TTTAGGAATT CTGTCAAATG TAAAGTTGGT ACATAACCA 2721
Ser Tyr Phe
850

CCAAATATTG GGTTATAGTG CATGTGTCTA GTTTTAGAAT CACTCTCACT GGTTGCTCTA 2781 

GTGATAAAAG GAAGTATCAT ATCTACTGAA CTTCCGTACA GTGTCCATAA AATCTTGCAC 2841 

TCATTCACTT TCTTCATTTT CTCTCAGAGA ACTAAACTCT CTAATTATTA CAATTTTATT 2901 

CTTCGTTTTG AATTTCATGG AGATTGCCCT CTGGTAACTT CCAAAAAAAC GTTGATAAGG 2961 

CAGTTTAATC CACCACTTTG TGTAGAAAAA ATGAGATCTA GGACAGACAG GGTTACACAT 3021 

AGAAACCATC TACCAAATCA AATAATCAAT GAGAAACACA GACTAACTAA ATAATCAGCA 3081

AAGTTGAAAT CAGAACTTAT TTTCTGATTT CCAGTAAGAG CACACACAGA AGAAAATACT 3141 

GACTTTTTTT TTCTTCTGTT CTTCAAGCTA CTGGCCAATA ATCTAAGGAG GAAATGTTCC 3201 

TTTTCTGCTG TCAAATACAA ATATATTATA TCCAACAATG ATCAGAAGCC CAGGGATTCT 3261 

GTGGCTGAAT TGGGAATATT TGGAAGAAGC TGAGGAGGAG GGTGACCAGC ATTCTCAACA 3321 

AACCTGGACA AGCAAGATCT CTCAGACACT GAGCCTCTAA CCAGAGATCA TACACAAGCT 3381 

GATGTGAAGC CCCCAACAAA TATGCACCAT AAGACTGCCT GGTCTAGCAT CAGTGGGAGA 3441 

CACACCTAAC CCCAGAGAGA CTTAAGTCCC CAGGGATTGG GAAGTGCTGG GCATTGGGGA 3501 

TGTAGGGATA TCATCTTGGA GATGGCAGAG GAGTTGTTAG ATGAGGAAGA GTCAGTGGGG 3561 

CAAACCAGGA GGGGGATAAC TACTAGATTG TAACAAAAAT ATTGAGTAAT AATAAATTAA 3621 

AAAA 3625 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 852 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



Met Phe Ile Phe Met Gly Val Phe Phe Leu Leu Asn Ile Thr Leu Leu
1 5 10 15

Met Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp
20 25 30

Glu Ile Thr Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala
35 40 45

Ala Val Gln Thr Pro Ile Glu Lys Asp Tyr Phe Asn Thr Thr Leu Asn
50 55 60

Phe Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe
65 70 75 80

Ala Met Asp Glu Ile Asn Arg Tyr Pro Asp Leu Leu Pro Asn Met Ser
85 90 95

Leu Ile Ile Arg Tyr Ser Leu Gly His Cys Asp Gly Lys Thr Val Thr
100 105 110

Pro Thr Pro Tyr Leu Phe His Arg Lys Lys Gln Ser Pro Ile Pro Asn
115 120 125

Tyr Phe Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro
130 135 140

Asn Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu
145 150 155 160

Ser Pro Arg Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe
165 170 175

Ser Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp
180 185 190

Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp
195 200 205

Asn Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe
210 215 220

Leu Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala
225 230 235 240

Phe Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr
245 250 255

Glu Ile Asn Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile
260 265 270

Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp
275 280 285

Glu Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn
290 295 300

Phe Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser
305 310 315 320

Leu Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe
325 330 335

Val Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Cys Leu Val Met
340 345 350

Pro Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys
355 360 365

Ile Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu
370 375 380

Glu Lys Leu Asp Met Ala Phe Ser Glu Asn Ser His Asn Ile Tyr Asn
385 390 395 400

Ala Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln
405 410 415

Ala Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys
420 425 430

Leu Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu
435 440 445

Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr
450 455 460

Asp Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met
465 470 475 480

Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His
485 490 495

Leu Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro
500 505 510

Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp
515 520 525

Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu
530 535 540

Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro
545 550 555 560

Glu Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly
565 570 575

Val Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu
580 585 590

Met Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe
595 600 605

Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu
610 615 620

Ser Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe
625 630 635 640

Phe Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile
645 650 655

Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys
660 665 670

Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Val Pro Gly Arg Arg
675 680 685

Leu Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile
690 695 700

Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser
705 710 715 720

Pro Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile
725 730 735

Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly
740 745 750

Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala
755 760 765

Lys Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser
770 775 780

Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His
785 790 795 800

Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu
805 810 815

Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr
820 825 830

Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu
835 840 845

Lys Ser Tyr Phe
850

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 1...2169

(D) OTHER INFORMATION: VR5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATC TGT AAT GAA GAG AGT ATG TGT TCA TTT CTG CTT TCA GGA CCC AAT 48 
Ile Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn
1 5 10 15

TGG GAT GAA TCT TTA AGT TTC TGG AAG TAC CTG GAC AGC TTC TTA TCT 96 
Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser 
20 25 30 

CCA CAT ATC CTT CAG CTT TCC TAT GGA TCT TTC AGT TCC ATC TTC AGT 144 
Pro His Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser
35 40 45 

GAT GAT GAA CAA TAT CCC TAT CTC TAT CAG ATG GCC CCA AAG GAC ACA 192 
Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr
50 55 60

TCT CTA GCA TTG GCA ATG GTC TCC TTC ATA CTT TAT TTG AAA TGG AAT 240
Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp Asn
65 70 75 80 

TGG ATT GGC CTT GTC ATC CCA GAT GAC GAT CAA GGA AAC CAA TTT CTT 288 
Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu
85 90 95 

TTA GAG TTG AAG AAA CAG AGT GAA AAC AAA GAA ATT TGC TTT GCC TTT 336 
Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe
100 105 110

GTG AAA ATG ATA TCT GTT GAT GAA GTT TCA TTT CCA CAA AAA ACT GAA 384
Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu
115 120 125 

ATA TAC TAC AAA CAA ATT GTG AAG TCA TTA ACA AAT GTT ATT ATC ATT 432 
Ile Tyr Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile
130 135 140 

TAT GGA GAA ACA TAT AAT TTC ATT GAT TTG ATC TTC AGA ATG TGG GAA 480 
Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu
145 150 155 160 

CCT CCC ATT TTA CAG AGA ATA TGG ATC ACC ACA AAA CAA TTG AAT TTC 528 
Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe
165 170 175 

CCT ACC AGT AAG ACA GAC ATA AGT CAT GAC ACA TTC TAT GGA TCA CTT 576 
Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser Leu
180 185 190 

ACT TTT CTA CCC CAC CAT GGT GAG ATT TCT GGC TTT AAA AAT TTT GTA 624 
Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val
195 200 205 

CAG ACA TGG TTC CAT CTC AGA AAC ACA GAT TTA TAT CTA GTA ATG CCA 672 
Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met Pro
210 215 220 

GAG TGG AAA TAT ATT AAC TCT GAA GAC TCA GCA TCT AAT TGT AAA ATA 720 
Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys Ile
225 230 235 240

CTG AAG AAC AGT TCA TCT GAT GCC TCA TTT GAT TGG CTA ATG GAA CAG 768
Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu Gln
245 250 255 

AAG CTT GAC ATG GCC TTT AGT GAT AAT AGT CAT AAC ATA TAT AAT GTT 816 
Lys Leu Asp Met Ala Phe Ser Asp Asn Ser His Asn Ile Tyr Asn Val
260 265 270 

GTG CAT GCC ATA GCC CAT GCC CTC CAT GAG ATG AAT CTG CAA CAG GCT 864 
Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala
275 280 285 

GAT AAT CAG GCA ATA GAT AAT GGA AAA GGA GCC AGT TCT CAC TGC TTG 912 
Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Leu
290 295 300 

AAG GTA AAC TCC TTT CTA AGA AGG ACC TAC TTC ACT AAT CCT CTT GGG 960 
Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly 
305 310 315 320 

GAC AAA GTG TTT ATG AAG CAA AGA GTA ATA ATG CAG GAT GAA TAT GAC 1008 
Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp
325 330 335 

ATT GTT CAC TTT GCG AAT CTC TCA CAA CAC CTT GGG ATT AAG ATG AAG 1056 
Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys
340 345 350 

TTA GGA AAG TTC AGC CCA TAT TTA CCA CAT GGT CGA CAC TCT CAC TTA 1104 
Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu 
355 360 365 

TAC GTA GAC ATG ATT GAG TTG GCC ACA GGA AGA AGA AAG ATG CCA TCC 1152 
Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser
370 375 380 

TCT GTG TGC AGT GCA GAT TGT AGT CCT GGA TTC AGA AGA TTA TGG AAG 1200 
Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys 
385 390 395 400 

GAG GGA ATG GCA GCC TGC TGT TTT GTT TGC AGC CCC TGC CCT GAA AAT 1248 
Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn 
405 410 415 

GAA ATT TCT AAT GAG ACA AAT ATG GAT CAA TGC GTG AAT TGT CCA GAA 1296 
Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu
420 425 430 

TAC CAA TAT GCC AAC ACA GAA CAG AAC AAA TGT ATT CAG AAA GGT GTC 1344 
Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val
435 440 445 

ACC TTC CTA AGC TAT GAA GAC CCC TTG GGG ATG GCA CTT GCC TTA ATG 1392
Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met 
450 455 460 

GCC TTC TGC TTC TCT GCA TTC ACA GCT GTG GTA CTT TGT GTC TTT GTG 1440 
Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe Val 
465 470 475 480 

AAG CAC CAT GAC ACT CCT ATT GTG AAG GCC AAT AAC AGA AGC CTC AGC 1488 
Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser
485 490 495 



TAT CTA TTA CTC ATG TCA CTC ATG TTC TGT TTT CTG TGC TCC TTT TTC 1536
Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe
500 505 510 

TTC ATT GGC CTT CCA AAC AAA GTC ATC TGT GTC TTA CAG CAG ATC ACA 1584 
Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr
515 520 525 

TTT GGA ATT GTA TTT ACT GTA GCT GTT TCC ACA GTT CTG GCC AAA ACA 1632
Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr
530 535 540 

GTC ACT GTG GTT CTA GCT TTC AAA GTC ACA GAC CCA GGA AGA AGA TTG 1680 
Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu 
545 550 555 560 

AGA TAC TTC CTT GTA TCA GGG ACA CTA AAC TAC ATT ATT CCT ATA TGT 1728 
Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys
565 570 575 

TCC CTA CTC CAA TGT GTT CTG TGT GCA ATC TGG CTA GCA GTC TCT CCT 1776 
Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro
580 585 590 

CCC TTT GTT GAT ATT GAT GAA CAC TCT CAG CAT GGC CAC ATC ATC ATT 1824 
Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile
595 600 605 

GTG TGC AAC AAG GGC TCA GTT ACT GCA TTC TAC TGT GTC CTT GGA TAC 1872 
Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr 
610 615 620 

TTG GCC TGC CTG GCA CTG GGA AGC TTC ACT TTG GCT TTC TTG GCC AAG 1920 
Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys 
625 630 635 640 

AAT CTG CCT GAT GCA TTC AAT GAA GCC AAG TTC TTG ACC TTC AGC ATG 1968 
Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met 
645 650 655 

CTA GTG TTC TGC AGT GTC TGG GTC ACC TTC CTC CCT GTG TAC CAT AGC 2016 
Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser 
660 665 670 

ACA AAG GGC AAA CAC ATG GTT GCT GTG GAG ATC TTC TCC ATC TTG GCA 2064
Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala
675 680 685 

TCC AGT GCA GGG ATG CTT GAA TGT ATT TTT GTA CCC AAG ATT TAT ATC 2112 
Ser Ser Ala Gly Met Leu Glu Cys Ile Phe Val Pro Lys Ile Tyr Ile
690 695 700 

ATT TTA ATG AGA CCA GAG AGA AAT TCT ACC CAA AAG ATC AGG GAA AAA 2160 
Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys
705 710 715 720 



TCA TAT TTC TGAACAAATA TTTAGGAATT CTGTCAAATG TAAAGTTGGT ACATAACCA 2218 
Ser Tyr Phe 



CCAAATATTG GGTTATAGTG CATGTGTCTA GTTTTAGAAT CACTCTCACT GGTTGCTCTA 2278 

GTGATAAAAG GAAGTATCAT ATCTACTGAA CTTATGTACA GTGTCCATAA AATCTTGCAC 2338 

TCATTCACTT TCTTCATTTT CTCTCAGAGA ACTAAACTCT CTAATTATTA CAATTTTATT 2398

CTTCGTTTTG ATTTCATGGA GATTGCCCTC TGGTAACTTC CAAAAACCGT TGATAAGGCA 2458

GTTTAATCCA CCACTTTGTG TAGAAAAAAT GAGATCTAGG ACAGACAGGG TTACACATAG 2518

AAACCATCTA CCAAATCAAA TAATCAATGA GAAACACAGA CTAACTAAAT AATCAGCAAA 2578

GTTGAAATCA GAATTATTTT CTGATTTCCA GTAAGAGCAC ACACAGAAGA AAATACTGAC 2638

TTTTTTTTTC TTCTGTTCTT CAAGCTACTG GCCAATAATC TAAGGAGGAA ATGTTCCTTT 2698

TCTGCTGTCA AATACAAATA TATTATATCC AACAATGATC AGAAGCCCAG GGATTCTGTG 2758 

GCTGAATTGG GAATATTTGG AAGAAGCTGA GGAGGAGGGT GACCAGCATT CTCAACAAAC 2818 

CTGGACAAGC AAGATCTCTC AGACACTGAG CCTCTAACCA GAGATCATAC ACAAGCTGAT 2878 

GTGAAGCCCC CAACAAATAT GCACCATAAG ACTGCCTGGT CTAGCATCAG TGGGAGACAC 2938 

ACCTAACCCC AGAGAGACTT AAGTCCCCAG GGATTGGGAA GTGCTGGGCA TTGAGGATGT 2998 

AGGGATATCA TCTTTGAGAT GGCAGAGGAG TTGTTAGATG AGGAAGAGTC AGGGGGGCAA 3058

ACCAGGAAGG GGATAACTAC TAGATTGTAA CAAAAATATT GAGTAATAAT AAATTAAAAA 3118 

ATGAAAT 3125 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 723 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



Ile Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn
1 5 10 15

Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser
20 25 30

Pro His Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser
35 40 45

Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr
50 55 60

Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp Asn
65 70 75 80

Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu
85 90 95

Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe
100 105 110

Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu
115 120 125

Ile Tyr Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile
130 135 140

Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu
145 150 155 160

Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe
165 170 175

Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser Leu
180 185 190

Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val
195 200 205

Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met Pro
210 215 220

Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys Ile
225 230 235 240

Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu Gln
245 250 255

Lys Leu Asp Met Ala Phe Ser Asp Asn Ser His Asn Ile Tyr Asn Val
260 265 270

Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala
275 280 285

Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Leu
290 295 300

Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly
305 310 315 320

Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp
325 330 335

Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys
340 345 350

Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu
355 360 365

Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser
370 375 380

Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys
385 390 395 400

Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn
405 410 415

Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu
420 425 430

Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val
435 440 445

Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met
450 455 460

Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe Val
465 470 475 480

Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser
485 490 495

Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe
500 505 510

Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr
515 520 525

Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr
530 535 540

Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu
545 550 555 560

Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys
565 570 575

Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro
580 585 590

Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile
595 600 605

Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr
610 615 620

Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys
625 630 635 640

Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met
645 650 655

Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser
660 665 670

Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala
675 680 685

Ser Ser Ala Gly Met Leu Glu Cys Ile Phe Val Pro Lys Ile Tyr Ile
690 695 700

Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys
705 710 715 720

Ser Tyr Phe



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1889 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GAATTCGGCT TCTGCACCAA ATGGCGACGA AAGACACATC TCTTTCACTT GCCATTGTTT 60

CTTTGATGGT TCATTTTAGG TGGTCTTGGG TTGGTCTAAT TCTCCCAGAT GACCACAAAG 120 

GAAATAAAAT ACTATCAGAT TTTAGAAAGG AGATGGAAAG AAAAAGAATC TGTACGGCTT 180 

TTGTAAAAAT GATTCCTGCC ACATGGACTT CATCTTTTGT CAAATTCTGG GAAAATATGG 240

ATGACACCAA CATAATAATT ATTTATGGTG ACATTGATTC TCTAGAAGGT CTAATGCGAA 300 

ATATTGGGCA AAGGTTATTG ACATGGCATG TCTGGGTCAT GAACATTGAA CCCCATATTA 360 

TTGAATATGA TAATTATTTC ATGTTAGATT CATTCCATGG AAGTTTAATT TTTAAGCACA 420 

ATTATAGAGA GAATTTTGAG TTTACCAAAT TTATTCGAAC AGTTAATCCT AAAAAATACC 480

CAGAAGACAT TTATCTCCCT AAGATGTGGT ATTTGTTCTT CATGTGCTCA TTTTCTGATA 540

TTAATTGTCA AGTTTTGGAC AGCTGTCAAA CAAATGCTTC TTTGGATATG TTACCTAGTC 600 

AGATATTTGA TGTGGTCATG AGTGAAGAGA GCACAAGTAT TTACAATGCT GTGTACGCTG 660

TGGCTCACAG CCTCCATGAG ATGAGACTTC AGCAACTTCA AACACAACCG TGTGAAAATG 720 

AAGAAGGGAT GGAGTTCTTT CCATGGCAGC TTAATACTTT CCTGAAGGAT ATTGAGGTGA 780

GAGTCAACAG TTTAGACTGG AGACAGAGAA TAGATGCTGA ATATGACATT CTTAACCTCT 840 

GGAATTTACC AAAGGGTCTT GGACTAAAAG TGAAAATAGG AAACTTTTAT GCAAATGCTC 900 

CCCAGGGTCA ACAATTGTCT TTATCTGAAC AGATGATTCA ATGGCCAGAA ATATTTTCAG 960 

AGATCCCTCA GTCGGTGTGC AGTGAGAGTT GTGGGCCTGG ATTCAGGAAA GTAACCCTGG 1020 

AGAATAAGGC TATCTGCTGC TACAATTGTA CTCCCTGTGC AGACAATGAG ATTTCTAATG 1080 

AGACAGATGT AGACCAGTGT GTGAAGTGTC CAGAGAGTCA TTATGCAAAT ACAGAGAAGA 1140 

GCAACTGCTA TCAAAAGTCT GTGAGCTTTC TGGGCTATGA AGAGGGTTTG GGGATGGCTC 1200 

TAGCCAGCAT AGCTTTGTGC TTGTCTGCAC TAACTGCCTT TGTTATTGGC ATATTTGTGA 1260 

AACACAAAGA CACTCCTATT GTTAAGGCCA ATAATCAAGC TCTGAGTTAC ACTTTGCTCA 1320 

TCACACTCAA ATTCTGTTTC CTATGTTCTT TGAACTTCAT TGGTCAGCCC AACACAGTTG 1380

CCTGCATCCT TCAGCAGACC ACCTTTGCAG TTGCTTTCAC TATGGCTCTT GCCACTGTGT 1440 

TGGCCAAAGC TATCACTGTG GTTCTTGCCT TTAAGGTCAG TTTTCCAGGG AGAATGGTAA 1500 

GATGGCTAAT GATATCAAGG GGTCCAAACT ATATCATTCC TATCTGCACC CTGATCCAAC 1560 

TTCTTCTTTG TGGAATATGG ATGGCAATAT CTCCACCATA CATTGACCAA GATGCTCATA 1620

TTGAACATGG TCACATCATC ATTTTGTGCA ACAAGGGCTC AGCTGTTGCC TTCCACTCTG 1680 

TCCTGGGATA CCTCTGCTTC TTGGCCCTTG GGAGTTATAC CATGGCCTTC TTGTCAAGAA 1740 

ATTTGCCTGA TACATTCAAC GAATCCAAAT TTATCTCACT AAGTATGCTG GTATTCTTCT 1800 

GTGTCTGGAT CACCTTTCTT CCTGTCTACC ACAGCACTAA AGGGAAGGTC ATGGTCGCCG 1860 

TCGAGGTCTT TTGCATCCAA GCCGAATTC 1889 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



Ser Leu Ser Leu Ala Ile Val Ser Leu Met Val His Phe Arg Trp Ser
1               5                   10                  15

Trp Val Gly Leu Ile Leu Pro Asp Asp His Lys Gly Asn Lys Ile Leu
            20                  25                  30

Ser Asp Phe Arg Lys Glu Met Glu Arg Lys Arg Ile Cys Thr Ala Phe
        35                  40                  45

Val Lys Met Ile Pro Ala Thr Trp Thr Ser Ser Phe Val Lys Phe Trp
    50                  55                  60

Glu Asn Met Asp Asp Thr Asn Ile Ile Ile Ile Tyr Gly Asp Ile Asp
65                  70                  75                  80

Ser Leu Glu Gly Leu Met Arg Asn Ile Gly Gln Arg Leu Leu Thr Trp
                85                  90                  95

His Val Trp Val Met Asn Ile Glu Pro His Ile Ile Glu Tyr Asp Asn
            100                 105                 110

Tyr Phe Met Leu Asp Ser Phe His Gly Ser Leu Ile Phe Lys His Asn
        115                 120                 125

Tyr Arg Glu Asn Phe Glu Phe Thr Lys Phe Ile Arg Thr Val Asn Pro
    130                 135                 140

Lys Lys Tyr Pro Glu Asp Ile Tyr Leu Pro Lys Met Trp Tyr Leu Phe
145                 150                 155                 160

Phe Met Cys Ser Phe Ser Asp Ile Asn Cys Gln Val Leu Asp Ser Cys
                165                 170                 175

Gln Thr Asn Ala Ser Leu Asp Met Leu Pro Ser Gln Ile Phe Asp Val
            180                 185                 190

Val Met Ser Glu Glu Ser Thr Ser Ile Tyr Asn Ala Val Tyr Ala Val
        195                 200                 205

Ala His Ser Leu His Glu Met Arg Leu Gln Gln Leu Gln Thr Gln Pro
    210                 215                 220

Cys Glu Asn Glu Glu Gly Met Glu Phe Phe Pro Trp Gln Leu Asn Thr
225                 230                 235                 240

Phe Leu Lys Asp Ile Glu Val Arg Val Asn Ser Leu Asp Trp Arg Gln
                245                 250                 255

Arg Ile Asp Ala Glu Tyr Asp Ile Leu Asn Leu Trp Asn Leu Pro Lys
            260                 265                 270

Gly Leu Gly Leu Lys Val Lys Ile Gly Asn Phe Tyr Ala Asn Ala Pro
        275                 280                 285

Gln Gly Gln Gln Leu Ser Leu Ser Glu Gln Met Ile Gln Trp Pro Glu
    290                 295                 300

Ile Phe Ser Glu Ile Pro Gln Ser Val Cys Ser Glu Ser Cys Gly Pro
305                 310                 315                 320

Gly Phe Arg Lys Val Thr Leu Glu Asn Lys Ala Ile Cys Cys Tyr Asn
                325                 330                 335

Cys Thr Pro Cys Ala Asp Asn Glu Ile Ser Asn Glu Thr Asp Val Asp
            340                 345                 350

Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr Glu Lys Ser
        355                 360                 365

Asn Cys Tyr Gln Lys Ser Val Ser Phe Leu Gly Tyr Glu Asp Pro Leu
    370                 375                 380

Gly Met Ala Leu Ala Ser Ile Ala Leu Cys Leu Ser Ala Leu Thr Ala
385                 390                 395                 400

Phe Val Ile Gly Ile Phe Val Lys His Lys Asp Thr Pro Ile Val Lys
                405                 410                 415

Ala Asn Asn Gln Ala Leu Ser Tyr Thr Leu Leu Ile Thr Leu Lys Phe
            420                 425                 430

Cys Phe Leu Cys Ser Leu Asn Phe Ile Gly Gln Pro Asn Thr Val Ala
        435                 440                 445

Cys Ile Leu Gln Gln Thr Thr Phe Ala Val Ala Phe Thr Met Ala Leu
    450                 455                 460

Ala Thr Val Leu Ala Lys Ala Ile Thr Val Val Leu Ala Phe Lys Val
465                 470                 475                 480

Ser Phe Pro Gly Arg Met Val Arg Trp Leu Met Ile Ser Arg Gly Pro
                485                 490                 495

Asn Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Leu Leu Leu Cys Gly
            500                 505                 510

Ile Trp Met Ala Ile Ser Pro Pro Tyr Ile Asp Gln Asp Ala His Ile
        515                 520                 525

Glu His Gly His Ile Ile Ile Leu Cys Asn Lys Gly Ser Ala Val Ala
    530                 535                 540

Phe His Ser Val Leu Gly Tyr Leu Cys Phe Leu Ala Leu Gly Ser Tyr
545                 550                 555                 560

Thr Met Ala Phe Leu Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ser
                565                 570                 575

Lys Phe Ile Ser Leu Ser Met Leu Val Phe Phe Cys Val Trp Ile Thr
            580                 585                 590

Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val
        595                 600

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1889 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear


(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GAATTCGGCT TCTGCATCAA ATGGCGACGA AGGACACATC TCTTTCACTT GCCATTGTTT 60

CTTTGATGGT TCATTTTAGG TGGTCTTGGG TTGGTCTAAT TCTCCCAGAT GACCACAAAG 120 

GAAATAAAAT ACTATCAGAT TTTAGAAAGG AGATGGAGAG AAAAAGAATC TGTACGGCTT 180

TTGTAAAAAT GATTCCTGCC ACATGGACTT CATCTTTTGT CAAATTCTGG GAAAATATGG 240 

ATGACACCAA CATAATAATT ATTTATGGTG ACATTGATTC TCTAGAAGGT CCAATGCGAA 300

ATATTGGGCA AAGGTTATTG ACATGGCATG TCTGGGTCAT GAACATTGAA CCCCATATTA 360 

TTGAATATGA TAATTATTTC ATGTTAGATT CATTCCATGG AAGTTTAATT TTTAAGCACA 420 

ATTATAGAGA GAATTTTGAG TTTACCAAAT TTATTCGAAC AGTTAATCCT AAAAAATACC 480 

CAGAAGACAT TTATCTCCCT AAGATGTGGT ATTTGTTCTT CATGTGCTCA TTTTCTGATA 540 

TTAATTGTCA AGTTTTGGAC AGCTGTCAAA CAAATGCTTC TTTGGATATG TTACCTAGTC 600 

AGATATTTGA TGTGGTCATG AGTGAAGAGA GCACAAGTAT TTACAATGCT GTGTACGCTG 660 

TGGCTCACAG CCTCCATGAG ATGAGACTTC AGCAACTTCA AACACAACCG TGTGAAAATG 720 

AAGAAGGGAT GGAGTTCTTT CCATGGCAGC TTAATACTTT CCTGAAGGAT ATTGAGGTGA 780 

GAGTCAACAG TTTGGACTGG AGACAGAGAA TAGATGCTGA ATATGACATT CTTAACCTCT 840 

GGAATTTACC AAAGGGTCTT GGACTAAAAG TGAAAATAGG AAACTTTTAT GCAAATGCTC 900 

CCCAGGGTCA ACAATTGTCT TTATCTGAAC AGATGATTCA ATGGCCAGAA ATATTTTCAG 960 

AAGTCCCTCA GTCTGTGTGC AGTGAGAGTT GTAGGCCTGG ATTCAGGAAA GTATCCCTGG 1020 

ATGATAAGGC CATCTGCTGC TACAAGTGCA CTCCTTGTGC CGACAATGAG ATATCTAATG 1080 

AGACAGATGT AGACCAGTGT GTGAAGTGTC CAGAGAGTCA TTATGCAAAT ACAGAGAAGA 1140

GCAACTGCTT CCCAAAATCT GTGAGCTTTC TGGCCTATGA AGACCCCTTG GGGATGGCTC 1200 

TAGCCAGCAT AGCTTTGTGC TTATCTGCAC TCACTGTCTT TGTTATTGGC ATCTTTGTGA 1260 

AAAACAGAGA CACTCCTATT GTCAAGGCCA ATAATCGGAC TCTAAGTTAC ATTTTGCTCA 1320 

TCACACTCAC CTTTTGTTTC TTATGTTCTT TGAACTTCAT TGGTCAGCCC AACACAGCTG 1380 

CCTGCATCCT TCAGCAGACC ACCTTTGCAG TTGCTTTCAC TATGGCTCTT GCCACTGTGT 1440 

TGGCCAAAGC TATTACTGTA GTCCTTGCCT TTAAGATCAG TTTTCCAGGG AGAATGTTAA 1500 

GGTGGCTAAT GATATCAAGG GGTCCAAGAT ACATCATTCC TATCTGCACA CTGATCCAGC 1560 

TTCTTCTTTG TGGAATATGG ATGGCAACTT CTCCACCATT CATTGACCAA GATGTTAATA 1620 

CTGAAGATGG ATACATCATC CTTTTGTGCA ACAAGGGCTC AGCTGTTGCC TTCCATTCAG 1680 

TCCTGGGATA CCTCTGTTTC TTGGCCCTTG GGAGTTATAC CATGGCCTTC TTGTCTAGAA 1740 

ATTTGCCTGA TACATTCAAT GAATCCAAAT TTCTGTCATT CAGTATGCTG GTGTTCTTCT 1800 

GTGTCTGGGT CACCTTTCTT CCTGTCTACC ACAGCACTAA AGGGAAAGTT ATGGTCGTCG 1860 

TCGAAGTCTT CTGCATCCAA GCCGAATTC 1889 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



Ser Leu Ser Leu Ala Ile Val Ser Leu Met Val His Phe Arg Trp Ser
1               5                   10                  15

Trp Val Gly Leu Ile Leu Pro Asp Asp His Lys Gly Asn Lys Ile Leu
            20                  25                  30

Ser Asp Phe Arg Lys Glu Met Glu Arg Lys Arg Ile Cys Thr Ala Phe
        35                  40                  45

Val Lys Met Ile Pro Ala Thr Trp Thr Ser Ser Phe Val Lys Phe Trp
    50                  55                  60

Glu Asn Met Asp Asp Thr Asn Ile Ile Ile Ile Tyr Gly Asp Ile Asp
65                  70                  75                  80

Ser Leu Glu Gly Pro Met Arg Asn Ile Gly Gln Arg Leu Leu Thr Trp
                85                  90                  95

His Val Trp Val Met Asn Ile Glu Pro His Ile Ile Glu Tyr Asp Asn
            100                 105                 110

Tyr Phe Met Leu Asp Ser Phe His Gly Ser Leu Ile Phe Lys His Asn
        115                 120                 125

Tyr Arg Glu Asn Phe Glu Phe Thr Lys Phe Ile Arg Thr Val Asn Pro
    130                 135                 140

Lys Lys Tyr Pro Glu Asp Ile Tyr Leu Pro Lys Met Trp Tyr Leu Phe
145                 150                 155                 160

Phe Met Cys Ser Phe Ser Asp Ile Asn Cys Gln Val Leu Asp Ser Cys
                165                 170                 175

Gln Thr Asn Ala Ser Leu Asp Met Leu Pro Ser Gln Ile Phe Asp Val
            180                 185                 190

Val Met Ser Glu Glu Ser Thr Ser Ile Tyr Asn Ala Val Tyr Ala Val
        195                 200                 205

Ala His Ser Leu His Glu Met Arg Leu Gln Gln Leu Gln Thr Gln Pro
    210                 215                 220

Cys Glu Asn Glu Glu Gly Met Glu Phe Phe Pro Trp Gln Leu Asn Thr
225                 230                 235                 240

Phe Leu Lys Asp Ile Glu Val Arg Val Asn Ser Leu Asp Trp Arg Gln
                245                 250                 255

Arg Ile Asp Ala Glu Tyr Asp Ile Leu Asn Leu Trp Asn Leu Pro Lys
            260                 265                 270

Gly Leu Gly Leu Lys Val Lys Ile Gly Asn Phe Tyr Ala Asn Ala Pro
        275                 280                 285

Gln Gly Gln Gln Leu Ser Leu Ser Glu Gln Met Ile Gln Trp Pro Glu
    290                 295                 300

Ile Phe Ser Glu Val Pro Gln Ser Val Cys Ser Glu Ser Cys Arg Pro
305                 310                 315                 320

Gly Phe Arg Lys Val Ser Leu Asp Asp Lys Ala Ile Cys Cys Tyr Lys
                325                 330                 335

Cys Thr Pro Cys Ala Asp Asn Glu Ile Ser Asn Glu Thr Asp Val Asp
            340                 345                 350

Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr Glu Lys Ser
        355                 360                 365

Asn Cys Phe Pro Lys Ser Val Ser Phe Leu Ala Tyr Glu Asp Pro Leu
    370                 375                 380

Gly Met Ala Leu Ala Ser Ile Ala Leu Cys Leu Ser Ala Leu Thr Val
385                 390                 395                 400

Phe Val Ile Gly Ile Phe Val Lys Asn Arg Asp Thr Pro Ile Val Lys
                405                 410                 415

Ala Asn Asn Arg Thr Leu Ser Tyr Ile Leu Leu Ile Thr Leu Thr Phe
            420                 425                 430

Cys Phe Leu Cys Ser Leu Asn Phe Ile Gly Gln Pro Asn Thr Ala Ala
        435                 440                 445

Cys Ile Leu Gln Gln Thr Thr Phe Ala Val Ala Phe Thr Met Ala Leu
    450                 455                 460

Ala Thr Val Leu Ala Lys Ala Ile Thr Val Val Leu Ala Phe Lys Ile
465                 470                 475                 480

Ser Phe Pro Gly Arg Met Leu Arg Trp Leu Met Ile Ser Arg Gly Pro
                485                 490                 495

Arg Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Leu Leu Leu Cys Gly
            500                 505                 510

Ile Trp Met Ala Thr Ser Pro Pro Phe Ile Asp Gln Asp Val Asn Thr
        515                 520                 525

Glu Asp Gly Tyr Ile Ile Leu Leu Cys Asn Lys Gly Ser Ala Val Ala
    530                 535                 540

Phe His Ser Val Leu Gly Tyr Leu Cys Phe Leu Ala Leu Gly Ser Tyr
545                 550                 555                 560

Thr Met Ala Phe Leu Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ser
                565                 570                 575

Lys Phe Leu Ser Phe Ser Met Leu Val Phe Phe Cys Val Trp Val Thr
            580                 585                 590

Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val
        595                 600



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2561 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 80...349

(D) OTHER INFORMATION: VR8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATAGGTGCAA CTGTGTGTGT GATGTTTTTC TACATCAGAA ACGGATTTCA CAACAGCTCC 60 

ATCTTAGATC CTAGCAGAC ATG AAG AAG CTC TGT GCT TTC ACG ATT TCA TTG 112 

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu
1 5 10

TTG TTT CTG AAG TTT TCT CTC ATC TTG TGC TGT TGG AGT GAA CCA AGT 160
Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser
15 20 25

TGC TTT TGG AGG ATA AAG AAT AGT GAT GAT AAT GAC GGA GAT TTG CAA 208
Cys Phe Trp Arg Ile Lys Asn Ser Asp Asp Asn Asp Gly Asp Leu Gln
30 35 40

AGG GAA TGT CAT TTT TAC CTT GGG GCA GCT GAT ACA CCA GTT GAA GAT 256
Arg Glu Cys His Phe Tyr Leu Gly Ala Ala Asp Thr Pro Val Glu Asp
45 50 55

AAT TTT TAT AGT TCA CTT TTA AAA TTT AGG TTT TCT TTG GAC CAT TTA 304
Asn Phe Tyr Ser Ser Leu Leu Lys Phe Arg Phe Ser Leu Asp His Leu
60 65 70 75

ATC CTA ACC TAC GCG ACC ATG ACC GGC TGC CCC ATG TCC ATC AGG TAGCC 354
Ile Leu Thr Tyr Ala Thr Met Thr Gly Cys Pro Met Ser Ile Arg
80 85 90

CCCAAGGACA CACATTTGTC CCATGGCATG GTCTCCTTGA TGTTTCACTT TAGATGGACT 414 

TGGATAGGAA TGGTCATCTC AGATGATGAC CAGGGTATTC AGTTTCTCTC AGATTTAAGA 474 

GAAGAAAGCC AAAGGCATGG GATCTGTTTA GCTTTTGTTA ATATGATCCC AGAAAACATG 534 

CAGATATACA TGACAAGGGC TACAATATAT GATCAACAAA TTATGACATC TTCAGCAAAG 594 

GTTGTTATCA TTTATGGTGA AATGAACTCT ACTCTAGAAG TAAGCTTTAG AAGATGGGAA 654 

GAGTTAGGTG CTCGGAGAAT CTGGATCACA ACCTCACAAT GGGATGTCAT CACAAATAAA 714 

AAAGACTTCA CCCTTAATCT CTTCCATGGG ACTATCACTT TTGCACACCA CAGAGTTGAG 774 

ATTCCTAAAT TAAATAAATT CATGCAAACA ATGAACACTG CCAAATACCC AGTAGATATT 834 

TCTCATACTA TATTGGAGTG GAATTATTTT AATTGTTCAA TATCTAAGAA CAGCATTAGA 894 

ATGCATCATA TTACATTCAA CAACACCTTG GAATGGACAT CACTGCACAA CTATGATATG 954 

GCGATGAGTG ATGAAGGTTA CAGTTTATAT AATGCTGTTT ATGCTGTGGC CCACACCTAC 1014 

CATGAATACA TTTTTCAACA AGTAGAGTCT CAGAAAAAGG CAAAACCCAA AAGATATTTC 1074 

ACTGCTTGTC AGCAGCCTCA GGTTCCCTCC TCCGTGTGTA GTGTGGCATG TACTGCTGGA 1134 

TTCAGGAAAA TTTATCAAAA AGAAACAGCA GACTGCTGCT TTGATTGTGT TCAGTGCCCA 1194 

GAAAATGAGA TTTCCAACGA AACAGATATG GAACAGTGTG TGAGGTGTCC AGATGATAAG 1254 

TATGCCAACA TAGAGCAAAC CCACTGCCTC TCAAGAGCTG TATCATTTCT GGCTTATGAA 1314

GATCCATTGG GGATGGCTCT AGGCTGCATG GCACTGTCCT TCTCGGCCAT CACAATTCTA 1374 

GTCCTCGTCA CATTTGTGAA ACACAACGAT ACTCCCATTG TGAAGGCCAA TAACCGCATT 1434 

CTCAGCTACA TCCTGCTCAT CTCTCTCGTC TTCTGCTTTC TCTGCTCCCT GCTCTTCATT 1494 

GGACCTCCCG ACCAGGTCAC CTGCATCTTG CAGCAGACCA CATTTGGAGT ATTTTTCACT 1554 

GTGTCTGTTT CTACAGTGTT GGCCAAAACA ATAACTGTGG TCATGGCTTT CAAGCTCACT 1614 

ACTCCAGGAA GAAGGATGAG AGGGATGATG ATGACAGGGG CACCTAAGTT GGTCATTCCC 1674 

ATTTGTACCC TGATCCAACT TGTTCTCTGT GGAATCTGGT TGGTCACATC TCCTCCCTTT 1734 

ATTGACAGAG ATATACAATC TGAGCATGGG AAGATTGTCA TTCTTTGCAA TAAAGGCTCA 1794

GTCATTGCCT TCCACGTCGT CCTGGGATAC TTGGGCTCCT TGGCTCTGGG GAGCTTCACT 1854

TTGGCTTTCT TGGCTAGGAA CCTTCCTGAC ACATTCAATG AAGCCAAGTT CCTAACTTTC 1914 

AGCATGCTGG TGTTCTGCAG TGTCTGGATC ACCTTCCTCC CTGTCTACCA CAGCACCAGG 1974

GGGAGGGTCA TGGTGGTTGT GGAGGTTTTC TCCATCTTGG CTTCTAGTGC AGGGTTGCTA 2034 

ATGTGTATCT TTGTCCCAAA GTGTTATGTT ATTTTAATTA GACCAGATTC AAATATTATA 2094 

AAGAAACATA AAGGTAAAGT GCTTAATTGA AACTTTCATG GTATGAAAAT GTTAGATGAT 2154 

ATTCAACTTA TCTTATTCTT CATCTTAATA AAAGCAGTAC TTCATCATAT AAAAAATAAA 2214 

GTAATATACA GATTTATACT TACAAACTGG ACAGCAAACA TGAATATGTT GAGAACTGGG 2274 

ATTCTCAATT GAGGAATGGC TACCAACATT TTGATCTGTG GTTTTGTGTT TAAGCCATGC 2334

ACTTAATTAA TGATTAACAT GAGGTTACCC TACTGTCTGT GAACAGCGCC ACCTCTAGGC 2394 

ATGCTGTCCT TGAGTTATAA GAAAGGGTAC TGCATACACA ATGGACATGA AGCCAGTAAT 2454 

CAACATTATT CCACTTGCTT TCATGGAGTT CTTACTTCCA AGTTCATGCC TTGACTTTAT 2514 

TCAATGTTCT ATGACAAAGG TAGATAAATA AATAAACACT TTTCCTC 2561 
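
The relationship between the coding-sequence feature of SEQ ID NO: 15 (LOCATION: 80...349) and the amino acid sequence printed beneath the codons can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code and 1-based inclusive coordinates; the names `translate_cds`, `CODON_TABLE`, `THREE_LETTER` and `FRAGMENT` are illustrative only and do not appear in the application. `FRAGMENT` holds nucleotides 61 to 112 of SEQ ID NO: 15, so the ATG at position 80 of the full sequence falls at position 20 of the fragment.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (an assumption: the listing does not state which code was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_cds(cdna, start, end):
    """Translate positions start..end (1-based, inclusive) of a cDNA string."""
    seq = "".join(cdna.split()).upper()   # drop the spacing used in the listing
    cds = seq[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return [THREE_LETTER[CODON_TABLE[cds[i:i + 3]]] for i in range(0, len(cds), 3)]


# Nucleotides 61-112 of SEQ ID NO: 15; the coding sequence starts at fragment position 20.
FRAGMENT = "ATCTTAGATC CTAGCAGAC ATG AAG AAG CTC TGT GCT TTC ACG ATT TCA TTG"
print(" ".join(translate_cds(FRAGMENT, 20, 52)))
```

Applied to the full 2561-base sequence with `start=80` and `end=349`, the same function would be expected to return the 90 residues listed as SEQ ID NO: 16.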



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe
1               5                   10                  15

Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile
            20                  25                  30

Lys Asn Ser Asp Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe
        35                  40                  45

Tyr Leu Gly Ala Ala Asp Thr Pro Val Glu Asp Asn Phe Tyr Ser Ser
    50                  55                  60

Leu Leu Lys Phe Arg Phe Ser Leu Asp His Leu Ile Leu Thr Tyr Ala
65                  70                  75                  80

Thr Met Thr Gly Cys Pro Met Ser Ile Arg
                85                  90















(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2734 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 80...1387

(D) OTHER INFORMATION: VR9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATAGGTGCAA CTGTGTGTGT GATGTTTTTC TACATCAGAA ACGGATTTCA CAACAGCTCC 60 
ATCTTAGATC CTAGCAGAC ATG AAG AAG CTC TGT GCT TTC ACG ATT TCA TTG 112 

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu
1 5 10

TTG TTT CTG AAG TTT TCT CTC ATC TTG TGC TGT TGG AGT GAA CCA AGT 160
Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser
15 20 25

TGC TTT TGG AGG ATA AAG AAT AGT GAT GAT AAT GAC GGA GAT TTG CAA 208
Cys Phe Trp Arg Ile Lys Asn Ser Asp Asp Asn Asp Gly Asp Leu Gln
30 35 40

AGG GAA TGT CAT TTT TAC CTT GGG GCA GCT GAT ACA CCA GTT GAA GAT 256
Arg Glu Cys His Phe Tyr Leu Gly Ala Ala Asp Thr Pro Val Glu Asp
45 50 55

AAT TTT TAT AGT TCA CTT TTA AAA TTT AGA ATT GCA GCA AGT GAA TAT 304
Asn Phe Tyr Ser Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr
60 65 70 75

GAG TTT CTT CTC GTA ATG TTT TTT GCT ATC GAT GAG ATC AAC AGG AAT 352
Glu Phe Leu Leu Val Met Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn
80 85 90

CCT TAT CTT TTA CCC AAC ATA ACT TTG ATG TTC TCC TTC ATT GGT GGA 400
Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Phe Ile Gly Gly
95 100 105

AAC TGT CAG GAT TTA TTG AGA GTT ATG GAC CAA GCA TAT ACA CAA ATA 448
Asn Cys Gln Asp Leu Leu Arg Val Met Asp Gln Ala Tyr Thr Gln Ile
110 115 120

AAT GGA CAT ATG AAT TTT GTT AAT TAT TTC TGT TAT TTA GAT GAT TCA 496
Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser
125 130 135

TGT GCC ATA GGT CTT ACA GGA CCA TCA TGG AAA ACT TCC TTA AAA CTG 544
Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu
140 145 150 155

GCA ATG CAC TCT TCG ATG CCA CTG GTT TTC TTT GGA CCA TTT AAT CCT 592
Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro
160 165 170

AAC CTA CGC GAC CAT GAC CGG CTG CCC CAT GTC CAT CAG GTA GCC CCC 640
Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro
175 180 185

AAG GAC ACA CAT TTG TCC CAT GGC ATG GTC TCC TTG ATG TTT CAC TTT 688
Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe
190 195 200

AGA TGG ACT TGG ATA GGA ATG GTC ATC TCA GAT GAT GAC CAG GGT ATT 736
Arg Trp Thr Trp Ile Gly Met Val Ile Ser Asp Asp Asp Gln Gly Ile
205 210 215

CAG TTT CTC TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT GGG ATC TGT 784
Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys
220 225 230 235

TTA GCT TTT GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA TAC ATG ACA 832
Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr
240 245 250

AGG GCT ACA ATA TAT GAT CAA CAA ATT ATG ACA TCT TCA GCA AAG GTT 880
Arg Ala Thr Ile Tyr Asp Gln Gln Ile Met Thr Ser Ser Ala Lys Val
255 260 265

GTT ATC ATT TAT GGT GAA ATG AAC TCT ACT CTA GAA GTA AGC TTT AGA 928
Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg
270 275 280




AGA TGG GAA GAG TTA GGT GCT CGG AGA ATC TGG ATC ACA ACC TCA CAA 976
Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln
285 290 295

TGG GAT GTC ATC ACA AAT AAA AAA GAC TTC ACC CTT AAT CTC TTC CAT 1024
Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His
300 305 310 315

GGG ACT ATC ACT TTT GCA CAC CAC AGA GTT GAG ATT CCT AAA TTA AAT 1072
Gly Thr Ile Thr Phe Ala His His Arg Val Glu Ile Pro Lys Leu Asn
320 325 330

AAA TTC ATG CAA ACA ATG AAC ACT GCC AAA TAC CCA GTA GAT ATT TCT 1120
Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser
335 340 345

CAT ACT ATA TTG GAG TGG AAT TAT TTT AAT TGT TCA ATA TCT AAG AAC 1168
His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn
350 355 360

AGC ATT AGA ATG CAT CAT ATT ACA TTC AAC AAC ACC TTG GAA TGG ACA 1216
Ser Ile Arg Met His His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr
365 370 375

TCA CTG CAC AAC TAT GAT ATG GCG ATG AGT GAT GAA GGT TAC AGT TTA 1264
Ser Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Ser Leu
380 385 390 395

TAT AAT GCT GTT TAT GCT GTG GCC CAC ACC TAC CAT GAA TAC ATT TTT 1312
Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu Tyr Ile Phe
400 405 410

CAA CAA GTA GAG TCT CAG AAA AAG GCA AAA CCC AAA AGA TAT TTC ACT 1360
Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr
415 420 425

GCT TGT CAG CAG ATA TGG AAC AGT GTG TGAGGTGTCC AGATGATAAG TATGCCA 1414
Ala Cys Gln Gln Ile Trp Asn Ser Val
430 435



ACATAGAGCA AACCCACTGC CTCTCAAGAG CTGTATCATT TCTGGCTTAT GAAGATCCAT 1474

TGGGGATGGC TCTAGGCTGC ATGGCACTGT CCTTCTCGGC CATCACAATT CTAGTCCTCG 1534

TCACATTTGT GAAACACAAC GATACTCCCA TTGTGAAGGC CAATAACCGC ATTCTCAGCT 1594

ACATCCTGCT CATCTCTCTC GTCTTCTGCT TTCTCTGCTC CCTGCTCTTC ATTGGACCTC 1654

CCGACCAGGT CACCTGCATC TTGCAGCAGA CCACATTTGG AGTATTTTTC ACTGTGTCTG 1714

TTTCTACAGT GTTGGCCAAA ACAATAACTG TGGTCATGGC TTTCAAGCTC ACTACTCCAG 1774

GAAGAAGGAT GAGAGGGATG ATGATGACAG GGGCACCTAA GTTGGTCATT CCCATTTGTA 1834

CCCTGATCCA ACTTGTTCTC TGTGGAATCT GGTTGGTCAC ATCTCCTCCC TTTATTGACA 1894

GAGATATACA ATCTGAGCAT GGGAAGATTG TCATTCTTTG CAATAAAGGC TCAGTCATTG 1954

CCTTCCACGT CGTCCTGGGA TACTTGGGCT CCTTGGCTCT GGGGAGCTTC ACTTTGGCTT 2014

TCTTGGCTAG GAACCTTCCT GACACATTCA ATGAAGCCAA GTTCCTAACT TTCAGCATGC 2074

TGGTGTTCTG CAGTGTCTGG ATCACCTTCC TCCCTGTCTA CCACAGCACC AGGGGGAGGG 2134

TCATGGTGGT TGTGGAGGTT TTCTCCATCT TGGCTTCTAG TGCAGGGTTG CTAATGTGTA 2194

TCTTTGTCCC AAAGTGTTAT GTTATTTTAA TTAGACCAGA TTCAAATATT ATAAAGAAAC 2254

ATAAAGGTAA AGTGCTTAAT TGAAACTTTC ATGGTATGAA AATGTTAGAT GATATTCAAC 2314

TTATCTTATT CTTCATCTTA ATAAAAGCAG TACTTCATCA TATAAAAAAT AAAGTAATAT 2374

ACAGATTTAT ACTTACAAAC TGGACAGCAA ACATGAATAT GTTGAGAACT GGGATTCTCA 2434

ATTGAGGAAT GGCTACCAAC ATTTTGATCT GTGGTTTTGT GTTTAAGCCA TGCACTTAAT 2494

TAATGATTAA CATGAGGTTA CCCTACTGTC TGTGAACAGC GCCACCTCTA GGCATGCTGT 2554

CCTTGAGTTA TAAGAAAGGG TACTGCATAC ACAATGGACA TGAAGCCAGT AATCAACATT 2614

ATTCCACTTG CTTTCATGGA GTTCTTACTT CCAAGTTCAT GCCTTGACTT TATTCAATGT 2674

TCTATGACAA AGGTAGATAA ATAAATAAAC ACTTTCCTCA CAAAAAAAAA AAAAAAAAAA 2734



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 436 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 



Met 


Lys 


T ,vo 


Leu 


Cys 


Al a 
nld 


irne 


Thr 


He 


Ser 


T **** 

Leu 


Leu 


Phe 


Leu 


Lys 


Phe 










c 




















15 




Ser 


Leu 


Tl O 

lie 


Leu 


Cys 


Cys 


Trp 


Ser 


Glu 


Pro 


Ser 


Cys 


Phe 


Trp Arg 


He 








5ft 










25 










30 






Lys 


Asn 


Ser 


Asp 


Asp 


Asn 


ASp 


Gly Asp 


Leu 


Gin 


Arg 


Glu 


Cys 


His 


Phe 






35 










40 








45 








Leu 


Gly 


Ala 


Ala 


Asp 


TVit- 

ixir 


Pro 


Val 


ulU 


Asp 


Asn 


Phe 


Tyr 


Ser 


Ser 




50 










55 










£ ft 

b U 








Leu 


Leu 


Lvs 


Phe 


Arcr 


He 


Ala 


Ala 




CZl ii 














65 










70 










/ D 










80 


Met 


Phe 


Phe 


Ala 


He 






He 


Asn 


Arg 


Asn 


Pro 


Tyr 


Leu 


Leu 


Pro 










85 










on 
j \j 








95 




Asn 


lie 


Thr 


Leu 


Met 


Phe 


Coy* 


Pne 


lie 


uiy 




Asn 


Cys 


Gin 


Asp 


Leu 








100 










105 










110 




Leu 




Val 
115 


Met 


Asp 


Gin 




Tyr 
120 


Thr 




ij.e 


Asn 


Gly 
125 


His 


Met 


Asn 


Phe 


Val 


Asn 


Tyr 




Cys 


Tyr 


Leu 


Asp 


Asp 


Ser 


Cys 


Ala 


He 


Gly 


Leu 




130 




















140 








Thr 


Glv 


Pro 


Ser 




Lys 


J. XIX 


Ser 


Leu 


Lys 


Leu 


TV "1 _ 

Ala 


Met 


His 


Ser 


Ser 


145 










150 










155 










160 


Met 


Pro 


Leu 


Val 


Phe 


Phe 


Ol v 


Pro 


Pne 


Asn 


Pro 


Asn 


Leu 


Arg 


Asp 


His 










165 










170 








175 




Aso 


Arcr 


Leu 


Pro 


His 


Val 


His 


Gin 


Val 


2Vl a 
Ala 


Pro 


Lys 


Asp 


Thr 


His 


Leu 








180 










185 








190 






Ser 


His 


Glv 


Met 


Val 


Ser 


Leu 


Met 


Phe 


nxs 




Arg 


Trp 


Thr 


Trp 


lie 






195 










200 














Gly 


Met 


Val 


He 


Ser 


Asp 


Asp 


Asp 


Gin 


Glv 








Leu 


Gar 


Asp 




210 










215 










220 








Leu 


Aro 


Glu 


Glu 


Ser 


Gin 


Ar-g 


His 


Gly 


Tl o 

lie 


Cys 


Leu 


>tia 


Phe 


Val 


Asn 


225 










230 










235 










240 


Met 


He 


Pro 


Glu 


Asn 


Met 


Gin 


He 


Tyr 


Met 


Thr 


Arg 


Ala 


Thr 


He 


Tyr 










245 










250 










255 


Asp 


Gin 


Gin 


He 


Met 


Thr 


Ser 


Ser 


Ala 


Lys 


Val 


val 


He 


He 


Tyr 


Gly 








260 










265 










270 


Glu 


Met 


Asn 


Ser 


Thr 


Leu 


Glu 


Val 


Ser 


Phe 


Arg 


Arg 


Trp 


Glu 


Glu 


Leu 






275 










280 








285 








Gly 


Ala 


Arg 


Arg 


He 


Trp 


He 


Thr 


Thr 


Ser 


Gin 


Trp 


Asp 


Val 


He 


Thr 




290 










295 










300 








Asn 


Lys 


Lys 


Asp 


Phe 


Thr 


Leu 


Asn 


Leu 


Phe 


His 


Gly 


Thr 


He 


Thr 


Phe 


305 










310 










315 








320 


Ala 


His 


His 


Arg 


Val 


Glu 


He 


Pro 


Lys 


Leu 


Asn 


Lys 


Phe 


Met 


Gin 


Thr 










325 










330 








335 




Met 


Asn 


Thr 


Ala 
340 


Lys 


Tyr 


Pro 


Val 


Asp 
345 


He 


Ser 


His 


Thr 


He 
350 


Leu 


Glu 


Trp 


Asn 


Tyr 


Phe 


Asn 


Cys 


Ser 


He 


Ser 


Lys 


Asn 


Ser 


He 


Arg 


Met 


His 






355 










360 










365 






His 


He 


Thr 


Phe 


Asn 


Asn 


Thr 


Leu 


Glu 


Trp 


Thr 


Ser 


Leu 


His 


Asn 


Tyr 




370 










375 










380 








Asp 


Met 


Ala 


Met 


Ser 


Asp 


Glu 


Gly 


Tyr 


Ser 


Leu 


Tyr 


Asn 


Ala 


Val 


Tyr 


385 










390 










395 










400 


Ala 


Val 


Ala 


His 


Thr 
405 


Tyr 


His 


Glu 


Tyr 


Ile
410 


Phe 


Gln


Gln


Val


Glu 
415 


Ser 


Gln


Lys 


Lys 


Ala 


Lys 


Pro 


Lys 


Arg 


Tyr 


Phe 


Thr 


Ala 


Cys 


Gln


Gln


Ile



420 425 430 
Trp Asn Ser Val
435 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2732 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 80...1375

(D) OTHER INFORMATION: VR10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATAGTTGTAA ATGTGTGTGT GATGTTTTTC TACATCAGAA ACGGATTTCA CAACAACTCC 60 
ATCTTAGATC CTAGCAGAC ATG AAG AAG CTC TGT GCT TTC ACT ATT TCA TTT 112 

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe
1 5 10

TTG TCT CTG AAG TTT TCT CTC ATC TTG TGC TGT TTG ACT GAA GCA AGT 160 
Leu Ser Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser
15 20 25 

TGC TTT TGG AGG ATA AAG AAT AGT GAA GAT AGT GAT GGA GAT TTG CAA 208 
Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln
30 35 40 

AGA GAA TGT CAT TTT TAC CTT TGG GTA ATT GAT AAA CCT ATT GAA GAT 256 
Arg Glu Cys His Phe Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp
45 50 55 

AAT TTT TAT AAT TCA GTT TTA AAT TTT AGA ATA TCA GCA AGT GAA TAT 304 
Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr
60 65 70 75 

GAG TTT CTT CTG GTA ATG TTT TTT GCT ACT GAT GAG ATC AAC AAG AAT 352 
Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn
80 85 90 

CCT TAT CTT TTA CCC AAC ATA ACT TTG ATA TTC AGC ATC GTT GGT GGT 400 
Pro Tyr Leu Leu Pro Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly
95 100 105 

CAC TGT CAT GAT TTA TTG AGA GGT CTG GAT CAA TCA TAT ACA CAA ATA 448 
His Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile
110 115 120 

AAT GGA CGT GTG AAT TTT GTT AAT TAT TTC TGT TAT TTA GAT GAT TCA 496 
Asn Gly Arg Val Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser 
125 130 135 

TGT AAC ATA GGC CTT ACA GGA CCA TCA TGG AAA AAA TCC TTA AAA CTG 544 
Cys Asn Ile Gly Leu Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu
140 145 150 155 

GCA ATG GAT TCT TCA ATA CCA ATG GTT TTC TTT GGA CCA TTT AAT CCT 592
Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro
160 165 170 






AAC CTA CGC GAC CAT GAC CGG CTG CCC CAT GTC CAT CAG GTA GCC CCC 640 
Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro
175 180 185 

AAG GAC ACA CAT TTA TCC CAT GGC ATG GTC TCC TTG ATG TTT CAT TTT 688 
Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe 
190 195 200 

AGA TGG ACT TGG ATA GGA CTG GTC ATC TCA GAT GAT GAC CAG GGT ATT 736 
Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile
205 210 215 

CAG TTT CTC TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT GGG ATC TGT 784 
Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys
220 225 230 235 

TTA GCT TTT GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA TAC ATG ACA 832 
Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr
240 245 250 

AGG GCT ACA ATA TAT GAT AAA CAA ATT ATG ACA TCT TCA GCA AAG GTT 880 
Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val
255 260 265 

GTT ATC ATT TAT GGT GAA ATG AAC TCT ACT CTA GAA GTA AGC TTC AGA 928 
Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg
270 275 280 

AGA TGG GAA GAT TTA GGT GCT CGG AGA ATC TGG ATC ACA ACC TCA CAA 976 
Arg Trp Glu Asp Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln
285 290 295 

TGG GAT ATC ATA TTA AAT AAA AAA GAA TTC ACT CTT AAT CTC TTC CAT 1024 
Trp Asp Ile Ile Leu Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His
300 305 310 315 

GGC CCT ATC ACT TTT GCA CAC CAC AAA GTT GAG ATT CCT AAA TTA AGG 1072 
Gly Pro Ile Thr Phe Ala His His Lys Val Glu Ile Pro Lys Leu Arg
320 325 330 

AAT TTT ATG CAA ACA ATG AAC ACT GCC AAA TAC CCA GTA GAT ATT TCT 1120 
Asn Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser
335 340 345 

CAT ACT ATA CTG GAG TGG AAT TAT TTT AAT TGT TCA ATC TCT AAG AAC 1168 
His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn
350 355 360 

AGC AGT AAA ATG GAT CTT TTT ACA TCC AAC AAC ACA TTG GAA TGG ACA 1216 
Ser Ser Lys Met Asp Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp Thr 
365 370 375 

GCA CTG CAC AAC TAT GAT ATG GCC ATG AGT GAT GAA GGT TAC AAT TTG 1264 
Ala Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Asn Leu 
380 385 390 395 

TAT AAT GCT GTT TAT GTT GCG GCC CAC ACC TAC CAT GAA CAC ATT CTT 1312 
Tyr Asn Ala Val Tyr Val Ala Ala His Thr Tyr His Glu His Ile Leu
400 405 410 

CAA CAA GTA GAG TCT CAG AAA AAG GTA GAA CAC AAC AGA TAT TTC ACT 1360 
Gln Gln Val Glu Ser Gln Lys Lys Val Glu His Asn Arg Tyr Phe Thr
415 420 425 



GTT TGT CAG CAG ATA TAGAACAGTG TGTGAAATGT CCAGATGATA AGTATGCCAA C 1416 
Val Cys Gln Gln Ile
430 

ATAGAACAAA CCTACTGCCT CTCAAGAGCT GTATCATTTC TGGCTTTTGA AGAACCACTG 1476 

GGGATGGCTC TAGGCTGCAT GGCACTATCC TTCTCGGCCA TCACAATTCT AGTACTAGTC 1536 

ACATTTGTGA AGTACAAGAA TACTCCCATT GTGAAGGCCA ATAACCGCAT TCTCAGCTAC 1596 

ATCCTGCTCA TCTCTCTAGT CTTCTGTTTT CTCTGCTCCC TGCTCTTCAT TGGACATCCT 1656 

GACCAGGTCA CCTGCATCTT GCAGCAGACC ACATTTGGAG TATTTTTCAC TGTGTCTGTT 1716 

TCTACAGTGT TGGCCAAAAC AATAACTGTG GTCATGGCTT TCAAGTTCAC TACTCCAGGA 1776 

AGAAGGATGA GAGGGATGTT GGTAACAGGT GCACCTAAGT TGGTCATTCC CATTTGTACC 1836 

CTAATCCAAC TTGTTCTCTG TGGAATCTGG TTGGTAACAT CTCCTCCATT TATTGACAGA 1896 

GATATACAAT CTGAACATGG GAAGGTAGTC ATTCTTTGCA ATAAAGGCTC TGTCATTGCC 1956 

TTCCACATTG TCCTGGGATA CTTGGGCTCC TTGGCTCTGG GGAGCTTCAC TTTGGCTTTC 2016

TTGGCTAGGA ACCTTCCTGA CACATTCAAT GAAGCCAAAT TCCTAACTTT CAGCATGCTG 2076 

GTGTTCTGCA GTGTCTGGAT CACCTTCCTC CCTGTCTACC ACAGCACCAG GGGGAAGGTC 2136 

ATGGTGGTTG TGGAGGTTTT CTCAATCTTG GCTTCTAGTG CAGGGTTGCT AATGTGTATC 2196 

TTTGTCCCAA AGTGTTATGT TATTTTAGTT AGACCAGATT CAAATTTTAC AAAGAACCGC 2256 

AAAGGTAAAT TGCTTTATTG AAATTTTCAT GGTATGAAAA TGTTAGATTA TATTCAACTT 2316 

ATCTTATTCT TCATCTTAAC AAAAGTAGTA CTTCATCATA TAAAAAATTA AGTAATATAC 2376 

AGATTTATAC TTACAAACTG GACAGCAAAC ATGAATATGT TTAGAACTGG GAATCTCAAT 2436 

TGAGGAATGG GTATCATCAT TTTGACCTGT GGTTATGTGT TTAAGCCATG TGTTTAATTA 2496 

ATGATTAACA TGAGGTTGCC CTACTGTCTG TGAACCATAC CACCTCTAGG CACACTGTCC 2556 

TTGAGTTATA AGATAGGGTA CTGCATACAA AATGGACATG AAACCAGTAA TCAACATTAT 2616 

CCCTCTTGCT TTCATGGAGT TCTTGCATCC AATTTCATGC CTTGACTTCA TTCAATGTAC 2676 

TATGACAAAG GTACATAAAT AAATAAACAC TTTCCCCACC AAAAAAAAAA AAAAAA 2732 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 432 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe Leu Ser Leu Lys Phe
1               5                   10                  15

Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser Cys Phe Trp Arg Ile
            20                  25                  30

Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe
        35                  40                  45

Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser
    50                  55                  60

Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val
65                  70                  75                  80

Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro
                85                  90                  95

Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly His Cys His Asp Leu
            100                 105                 110

Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile Asn Gly Arg Val Asn
        115                 120                 125

Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Asn Ile Gly Leu
    130                 135                 140

Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu Ala Met Asp Ser Ser
145                 150                 155                 160

Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His
                165                 170                 175

Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu
            180                 185                 190

Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile
        195                 200                 205

Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp
    210                 215                 220

Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn
225                 230                 235                 240

Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr
                245                 250                 255

Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly
            260                 265                 270

Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asp Leu
        275                 280                 285

Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Ile Ile Leu
    290                 295                 300

Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Pro Ile Thr Phe
305                 310                 315                 320

Ala His His Lys Val Glu Ile Pro Lys Leu Arg Asn Phe Met Gln Thr
                325                 330                 335

Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu
            340                 345                 350

Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp
        355                 360                 365

Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr
    370                 375                 380

Asp Met Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr
385                 390                 395                 400

Val Ala Ala His Thr Tyr His Glu His Ile Leu Gln Gln Val Glu Ser
                405                 410                 415

Gln Lys Lys Val Glu His Asn Arg Tyr Phe Thr Val Cys Gln Gln Ile
            420                 425                 430

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2962 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 81...1601

(D) OTHER INFORMATION: VR11 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

CATAGTTGTA AATGTGTGTG TGATGTTTTT CTACATCAGA AACGGATTTC ACAACAACTC 60 
CATCTTAGAT CCTAGCAGAC ATG AAG AAG CTC TGT GCT TTC ACT ATT TCA 110

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser
1 5 10

TTT TTG TCT CTG AAG TTT TCT CTC ATC TTG TGC TGT TTG ACT GAA GCA 158 
Phe Leu Ser Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala
15 20 25 

AGT TGC TTT TGG AGG ATA AAG AAT AGT GAA GAT AGT GAT GGA GAT TTG 206 
Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu
30 35 40 

CAA AGA GAA TGT CAT TTT TAC CTT TGG GTA ATT GAT AAA CCT ATT GAA 254 
Gln Arg Glu Cys His Phe Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu
45 50 55 



GAT AAT TTT TAT AAT TCA GTT TTA AAT TTT AGA ATA TCA GCA AGT GAA 302
Asp Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu
60 65 70

TAT GAG TTT CTT CTG GTA ATG TTT TTT GCT ACT GAT GAG ATC AAC AAG 350 
Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys
75 80 85 90 

AAT CCT TAT CTT TTA CCC AAC ATA ACT TTG ATA TTC AGC ATC GTT GGT 398 
Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Ile Phe Ser Ile Val Gly
95 100 105 

GGT CAC TGT CAT GAT TTA TTG AGA GGT CTG GAT CAA TCA TAT ACA CAA 446 
Gly His Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln
110 115 120

ATA AAT GGA CGT GTG AAT TTT GTT AAT TAT TTC TGT TAT TTA GAT GAT 494
Ile Asn Gly Arg Val Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp
125 130 135 

TCA TGT AAC ATA GGC CTT ACA GGA CCA TCA TGG AAA AAA TCC TTA AAA 542 
Ser Cys Asn Ile Gly Leu Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys
140 145 150 

CTG GCA ATG GAT TCT TCA ATA CCA ATG GTT TTC TTT GGA CCA TTT AAT 590 
Leu Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn
155 160 165 170

CCT AAC CTA CGC GAC CAT GAC CGG CTG CCC CAT GTC CAT CAG GTA GCC 638 
Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala
175 180 185 

CCC AAG GAC ACA CAT TTA TCC CAT GGC ATG GTC TCC TTG ATG TTT CAT 686 
Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His 
190 195 200 

TTT AGA TGG ACT TGG ATA GGA CTG GTC ATC TCA GAT GAT GAC CAG GGT 734 
Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly
205 210 215 

ATT CAG TTT CTC TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT GGG ATC 782 
Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile
220 225 230 

TGT TTA GCT TTT GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA TAC ATG 830 
Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met
235 240 245 250 

ACA AGG GCT ACA ATA TAT GAT AAA CAA ATT ATG ACA TCT TCA GCA AAG 878 
Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Ser Ala Lys
255 260 265 

GTT GTT ATC ATT TAT GGT GAA ATG AAC TCT ACT CTA GAA GTA AGC TTC 926 
Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe
270 275 280 

AGA AGA TGG GAA GAT TTA GGT GCT CGG AGA ATC TGG ATC ACA ACC TCA 974 
Arg Arg Trp Glu Asp Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser
285 290 295 

CAA TGG GAT ATC ATA TTA AAT AAA AAA GAA TTC ACT CTT AAT CTC TTC 1022 
Gln Trp Asp Ile Ile Leu Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe
300 305 310 



CAT GGC CCT ATC ACT TTT GCA CAC CAC AAA GTT GAG ATT CCT AAA TTA 1070
His Gly Pro Ile Thr Phe Ala His His Lys Val Glu Ile Pro Lys Leu
315 320 325 330

AGG AAT TTT ATG CAA ACA ATG AAC ACT GCC AAA TAC CCA GTA GAT ATT 1118 
Arg Asn Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile
335 340 345 

TCT CAT ACT ATA CTG GAG TGG AAT TAT TTT AAT TGT TCA ATC TCT AAG 1166 
Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys
350 355 360 

AAC AGC AGT AAA ATG GAT CTT TTT ACA TCC AAC AAC ACA TTG GAA TGG 1214 
Asn Ser Ser Lys Met Asp Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp 
365 370 375 

ACA GCA CTG CAC AAC TAT GAT ATG GCC ATG AGT GAT GAA GGT TAC AAT 1262 
Thr Ala Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Asn 
380 385 390 

TTG TAT AAT GCT GTT TAT GTT GCG GCC CAC ACC TAC CAT GAA CAC ATT 1310 
Leu Tyr Asn Ala Val Tyr Val Ala Ala His Thr Tyr His Glu His Ile
395 400 405 410 

CTT CAA CAA GTA GAG TCT CAG AAA AAG GTA GAA CAC AAC AGA TAT TTC 1358 
Leu Gln Gln Val Glu Ser Gln Lys Lys Val Glu His Asn Arg Tyr Phe
415 420 425 

ACT GTT TGT CAG CAG GTA TCT TCC TTG ATG AAA ACC AGG GTA TTT ACG 1406 
Thr Val Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr
430 435 440 

AAC CCG GTT GGA GAA CTG GTG AAC ATG AAG CAT AGG GAA AAT CAG TGT 1454 
Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys
445 450 455 

ACA GAG TAT GAT ATT TTC ATC ATT TGG AAT TTT CCA CAA GGC CTT GGA 1502 
Thr Glu Tyr Asp Ile Phe Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly
460 465 470 

TTA AAA TTG AAA ATA GGA AGC TAT ATA CCT TGT TTT CCA AAG AGT CAA 1550 
Leu Lys Leu Lys Ile Gly Ser Tyr Ile Pro Cys Phe Pro Lys Ser Gln
475 480 485 490 

CAA CTT CAT ATA TCT GAT GAT TTG GAA TGG GCC ATG GGA GGA ACA TCA 1598 
Gln Leu His Ile Ser Asp Asp Leu Glu Trp Ala Met Gly Gly Thr Ser
495 500 505 

ATA TAGAACAGTG TGTGAAATGT CCAGATGATA AGTATGCCAA CATAGAACAA ACCTAC 1657 
Ile



TGCCTCTCAA GAGCTGTATC ATTTCTGGCT TTTGAAGAAC CACTGGGGAT GGCTCTAGGC 1717 

TGCATGGCAC TATCCTTCTC GGCCATCACA ATTCTAGTAC TAGTCACATT TGTGAAGTAC 1777 

AAGAATACTC CCATTGTGAA GGCCAATAAC CGCATTCTCA GCTACATCCT GCTCATCTCT 1837 

CTAGTCTTCT GTTTTCTCTG CTCCCTGCTC TTCATTGGAC ATCCTGACCA GGTCACCTGC 1897

ATCTTGCAGC AGACCACATT TGGAGTATTT TTCACTGTGT CTGTTTCTAC AGTGTTGGCC 1957 

AAAACAATAA CTGTGGTCAT GGCTTTCAAG TTCACTACTC CAGGAAGAAG GATGAGAGGG 2017 

ATGTTGGTAA CAGGTGCACC TAAGTTGGTC ATTCCCATTT GTACCCTAAT CCAACTTGTT 2077 

CTCTGTGGAA TCTGGTTGGT AACATCTCCT CCATTTATTG ACAGAGATAT ACAATCTGAA 2137 

CATGGGAAGG TAGTCATTCT TTGCAATAAA GGCTCTGTCA TTGCCTTCCA CATTGTCCTG 2197

GGATACTTGG GCTCCTTGGC TCTGGGGAGC TTCACTTTGG CTTTCTTGGC TAGGAACCTT 2257 

CCTGACACAT TCAATGAAGC CAAATTCCTA ACTTTCAGCA TGCTGGTGTT CTGCAGTGTC 2317 

TGGATCACCT TCCTCCCTGT CTACCACAGC ACCAGGGGGA AGGTCATGGT GGTTGTGGAG 2377 

GTTTTCTCAA TCTTGGCTTC TAGTGCAGGG TTGCTAATGT GTATCTTTGT CCCAAAGTGT 2437 

TATGTTATTT TAGTTAGACC AGATTCAAAT TTTACAAAGA ACCGCAAAGG TAAATTGCTT 2497 

TATTGAAATT TTCATGGTAT GAAAATGTTA GATTATATTC AACTTATCTT ATTCTTCATC 2557 
TTAACAAAAG TAGTACTTCA TCATATAAAA AATTAAGTAA TATACAGATT TATACTTACA 2617

AACTGGACAG CAAACATGAA TATGTTTAGA ACTGGGAATC TCAATTGAGG AATGGGTATC 2677 

ATCATTTTGA CCTGTGGTTA TGTGTTTAAG CCATGTGTTT AATTAATGAT TAACATGAGG 2737 

TTGCCCTACT GTCTGTGAAC CATACCACCT CTAGGCACAC TGTCCTTGAG TTATAAGATA 2797 

GGGTACTGCA TACAAAATGG ACATGAAACC AGTAATCAAC ATTATCCCTC TTGCTTTCAT 2857 

GGAGTTCTTG CATCCAATTT CATGCCTTGA CTTCATTCAA TGTACTATGA CAAAGGTACA 2917 

TAAATAAATA AACACTTTCC CCACAAAAAA AAAAAAAAAA AAAAA 2962 



(2) INFORMATION FOR SEQ ID NO: 22: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe Leu Ser Leu Lys Phe
1               5                   10                  15

Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser Cys Phe Trp Arg Ile
            20                  25                  30

Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe
        35                  40                  45

Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser
    50                  55                  60

Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val
65                  70                  75                  80

Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro
                85                  90                  95

Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly His Cys His Asp Leu
            100                 105                 110

Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile Asn Gly Arg Val Asn
        115                 120                 125

Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Asn Ile Gly Leu
    130                 135                 140

Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu Ala Met Asp Ser Ser
145                 150                 155                 160

Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His
                165                 170                 175

Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu
            180                 185                 190

Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile
        195                 200                 205

Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp
    210                 215                 220

Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn
225                 230                 235                 240

Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr
                245                 250                 255

Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly
            260                 265                 270

Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asp Leu
        275                 280                 285

Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Ile Ile Leu
    290                 295                 300

Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Pro Ile Thr Phe
305                 310                 315                 320

Ala His His Lys Val Glu Ile Pro Lys Leu Arg Asn Phe Met Gln Thr
                325                 330                 335

Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu
            340                 345                 350

Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp
        355                 360                 365

Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr
    370                 375                 380

Asp Met Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr
385                 390                 395                 400

Val Ala Ala His Thr Tyr His Glu His Ile Leu Gln Gln Val Glu Ser
                405                 410                 415

Gln Lys Lys Val Glu His Asn Arg Tyr Phe Thr Val Cys Gln Gln Val
            420                 425                 430

Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu
        435                 440                 445

Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe
    450                 455                 460

Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Leu Lys Ile Gly
465                 470                 475                 480

Ser Tyr Ile Pro Cys Phe Pro Lys Ser Gln Gln Leu His Ile Ser Asp
                485                 490                 495

Asp Leu Glu Trp Ala Met Gly Gly Thr Ser Ile
            500                 505

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2821 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 60...992

(D) OTHER INFORMATION: VR12



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GACGTTTTTC TGCATCAGAA ACGGATTTCA CAGCAGCTCC ATCTCAGATC CTAGCAGAC A 60 

Me 



TGA AGC AGC TCT GCA CTT TCA CTA TTT CAT TGT TGT TTC TGA AGT TTT 108 
t Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Se
1 5 10 15

CTC TCA TCT TGT GCT GTT GGA GTG AAC CAA GCT GCT TTT GGA GGA TAA 156 
r Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Ly
20 25 30 

AGA AGA GTG AAG ATA ATG ATG GAG ATT TAC AAA GGG AGT GTC ATT TTT 204 
s Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Ty
35 40 45 

ACC TTT GGA AAA CTG ATG AAC CTA TTG AAG ATA GTT TTT ATA ATT ATG 252 
r Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr As
50 55 60 6 

ATT TAA GTT TTA GAA TTG CAG GAA GTG AAT ATG AGC TTC TTC TGG TAA 300 
p Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val Me
5 70 75 80 

TGT TTT TTG CTA CTG ATG AGA TCA ACA AGA ATC CTT ATC TTT TAC CCA 348
t Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro As
85 90 95

ACA TGA GTT TGA TGT TCT CCA TCA TTG GTG GAA ACT GTC ATG ATT TAT 396 
n Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Le
100 105 110

TGA GAA GTC TGG ATC AAG AAT ATG CAC AAA TAG ATG GAC ATA TGA ATT 444 
u Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn Ph
115 120 125 

TTG TTA ATT ATT TCT GTT ATT TAG ATG ATT CAT GTG CCA CAG GCC TTA 492 
e Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu Th 
130 135 140 1 

CAG GAC CAT CAT GGA AAA CAT CCT TAA AAC TGG CAA TGC ATT CTT CAA 540 
r Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Me 
45 150 155 160 

TGC CAC TGG TTT TCT TTG GAC CAT TTA ATC CTA ACC TAC GCG ACC ATG 588 
t Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His As
165 170 175 

ACC GGC TGC CCC ATG TCC ATC AGG TAG CCC CCA AGG ACA CAC ATT TGT 636 
p Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Se
180 185 190 

CCC ATG GCA TGG TCT CCT TGA TGT TTC ATT TTA GGT GGA CTT GGA TAG 684 
r His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gl
195 200 205 

GAC TGG TCA TCT CAG ATG ATG ATC AGG GTA TTC AGT TTC TCT CAG ATT 732 
y Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Le
210 215 220 2 

TAA GAG AAG AAA GCC AAA GGC ATG GGA TCT GTT TGG CTT TTG TTA ATA 780
u Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Me
25 230 235 240 

TGA TCC CAG AAA ACA TGC AGA TAT ACA TGA CAA GGG CTA CAA TAT ATG 828
t Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr As
245 250 255 

ATA CAC AAA TTA TGA CAT CTT CAG CAA AGG TTG TTA TCA TTT ATG GTG 876 
p Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly As
260 265 270 

ACA TGA ACT CTA CTC TAG AAG CAA GCT TTA GAA GAT GGG AAG AGT TAG 924 
p Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gl 
275 280 285 

GTG CTC GGA GAA TCT GGA TCA CAA CCA CAC AAT GGG ATG TCA TCA CAA 972 
y Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr As
290 295 300 3 

ATA AAA AAA GAC TTC ACC CT TAATCTCTTC CATGGGACTA TTACTTTTGC ACACC 1027 
n Lys Lys Arg Leu His Pro 
05 310 

ACAAAGATGA GATTCCTAAA TTTAGGAATT TTATGCAAAC AAAGAAAACT GCCAAATACC 1087 

TTGTAGATAT TTCTCATACT ATTTTGGAGT GGAATTATTT TAATTGTTCA ATCTCTAAGA 1147 

ACAGCAGTAA AATGGGTCAT TTTACATTCA ACAACACATT GCAATGGACA GCACTGCACA 1207 

ACTATGATAT GGCCCTGAGC GATGAAGGTT ACAATTTGTA TAATGCTGTT TATGCTGTGG 1267 

CCCACACCTA CCATGAATAC ATTCTTCAAC AAGTAGAGTC TCAGAAAAAG GCAAAACCCA 1327 

AAAGATATTT CACTGCTTGT CAGCAGGTTC CCTCCTCTGT GTGTAGTGTG GCATGTACTG 1387 

CAGGATTCAG GAAAATTCAT CAGAAAGAAA CGGCAGATTG CTGCTTTGAT TGTGTTCAGT 1447 




GCCTAGAAAA TGAGGTTTCC AATGAAACAG ATATGGAACA GTGTGTGAGA TGTCCAGATA 1507 

ATAAATATGC CAATTTAGAG CAAACCCACT GCCTCCAAAG AACGGTGTCA TTTCTGGCTT 1567 

ATGAAGATCC ATTGGGGATG GCTCTAGGCT GCATGGCACT GTCCTTCTCG GCCATCACAA 1627 

TTCTAGTCCT CGTCACATTT GTGAAGTACA AGGATACTCC CATTGTGAAG GCCAATAACC 1687 

GCATTCTCAG CTACATCCTG CTCATCTCTC TCGTCTTCTG CTTTCTCTGT TCCCTGCTCT 1747 

TCATTGGACA TCCCGACCAG GTCACCTGCA TCTTGCAGCA GACCACATTT GGAGTATTGT 1807 

TCACTGTGTC TGTTTCTACA GTGTTGGCCA AAACAATAAC TGTGGTCATG GCTTTCAAGC 1867 

TCACTACTCC AGGAAGAAGG ATGAGAGGGA TGATGATGAC AGGGGCACCT AAGTTGGTCA 1927 

TTCCCATTTG TACCCTGATC CAACTTGTTC TCTGTGGAAT CTGGTTGGTC ACATCTCCTC 1987

CCTTTATTGA CAGAGATATA CAATCTGAAC ATGGGAAGAT TGTCATTCTT TGCAATAAAG 2047 

GCTCTGTCGT TGCCTTCCAC GTCGTCCTGG GATACTTGGG CTCCTTGGCT CTGGGGAGCT 2107 

TCACTTTGGC TTTCTTGGCT AGGAACCTTC CTGACACATT CAATGAAGCC AAGTTCCTAA 2167 

CTTTCAGCAT GCTGGTGTTC TGCAGTGTCT GGATCACCTT CCTCCCTGTC TACCACAGCA 2227 

CCAGGGGGAA GGTCATGGTG GTTGTGGAGG TTTTCTCCAT CTTGGCTTCT AGTGCAGGGT 2287

TGCTAATGTG TATCTTTGTC CCAAAGTGTT ATGTTATTTT AATTAGACCA GATTCAAATT 2347

TTATACAGAA CCACAAAGGT AAATTGCTTT ATTGAAACTT TCATGGTATG AAAATGTTAG 2407 

ATGATATTCA ACTTATCTTA TTCTTCATCT TAATAAAAGC AGTACTTCAT CATATAAAAA 2467 

ATAAAGTAAT ATACAGATTT ATACTTACAA ACTGGACAGC AAACATGAAT ATGTTGAGAA 2527 

CTGGGATTCT CAATTGAGGA ATGGCTACCA ATATTTTGAT CTGTGGTTTT GTGTTTAAGC 2587 

CATGTACTTA ATTAATGATT AACATGAGGT TACCCTACTG TCTTTGAACA GGGGGAGGTC 2647

TAGGCATGCT GTCCTTGAGT TATAAGAAAG GGTACTGCAT ACACAATGGA CATGAAGCCA 2707 

GTAATCAACA TTATTCCACT TGCTTTCATG GAGTTCTTAC TTCCAAGTTC ATGCCTTGAC 2767 

TTTATTCAAT GTTCTATGAC AAAGGTAGAA TAAATAAATA AACACTTTCC TCAC 2821 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 311 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



Met Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe
1               5                   10                  15

Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile
            20                  25                  30

Lys Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe
        35                  40                  45

Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr
    50                  55                  60

Asp Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val
65                  70                  75                  80

Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro
                85                  90                  95

Asn Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu
            100                 105                 110

Leu Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn
        115                 120                 125

Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu
    130                 135                 140

Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser
145                 150                 155                 160

Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His
                165                 170                 175

Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu
            180                 185                 190

Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile
        195                 200                 205

Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp
    210                 215                 220

Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn
225                 230                 235                 240

Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr
                245                 250                 255

Asp Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly
            260                 265                 270

Asp Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu
        275                 280                 285

Gly Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr
    290                 295                 300

Asn Lys Lys Arg Leu His Pro
305                 310

(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2773 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 3...1238

(D) OTHER INFORMATION: VR13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

AA GCA AGT TGC TTT TGG CGG ATA AAG AAT AGT GAA GAT AAT GAT GGA 47 
Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Asn Asp Gly
1 5 10 15

GAT TTG CAA AGG GAA TGT CAT TTT TAC CTT GGG GCA GTT GAT AAA CCA 95 
Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Val Asp Lys Pro
20 25 30 

ATT GAA GAT AAT TTT TAT AAT TCA CTT TTA AAG TTT AGA ATT GCA GCA 143 
Ile Glu Asp Asn Phe Tyr Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala
35 40 45 

AGT GAA TAT GAG TTT CTT CTG GTA ATG TTT TTT GCT ACT GAT GAG ATC 191 
Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile
50 55 60 

AAC AAG AAT CCT TAT CTT TTA CCC AAC ATA ACT TTG ATG TTC TCC ATC 239 
Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Ile
65 70 75 

ATT GGT GGA AAC TGT CAT GAT TTA TTG AGA GGT TTG GAT CAA GCA TAT 287 
Ile Gly Gly Asn Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr
80 85 90 95 

ACA CAA ATA AAT GGA CAT ATG AAT TTT GTT AAT TAT TTC TGT TAT TTA 335 
Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu
100 105 110 

GAT GAT TCA TGT GCC ATA GGT CTT ACA GGA CCA TCA TGG AAA ACA TCC 383 
Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser
115 120 125 



TTA AAA CTG GCA ATG CAT TCT TCA ATG CCA CTG GTT TTC TTT GGA TCA 431
Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Ser
130 135 140

TTT AAT CCT AAC CTA CAT GAC CAT GAC CGG CTG CAC CAT GTC CAT CAA 479 
Phe Asn Pro Asn Leu His Asp His Asp Arg Leu His His Val His Gln
145 150 155 

GTA GCC ACC AAG GAC ACA CAT TTG TCC CAT GGC ATT GTC TCC TTG ATG 527 
Val Ala Thr Lys Asp Thr His Leu Ser His Gly Ile Val Ser Leu Met
160 165 170 175 

TTT CAT TTT AGA TGG ACT TGG ATA GGA CTG GTC ATC TCA GAT GAT GAC 575 
Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp
180 185 190 

AAG GGT ATT CAG TTT CTC TCA GAT TTA AGA GAA GAA AGC CAA AGG CAT 623 
Lys Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His
195 200 205 

GGG ATC TGT TTA GCT TTT GTT AAT ATG ATC CCA GAA AAC ATG CAG ATA 671 
Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile
210 215 220 

TAC ATG ACA AGG GCT ACA ATA TAT GAT AAA CAA ATT ATG ACG TCT TTA 719 
Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu
225 230 235 

GCA AAA GTT GTT ATC ATT TAT GGT GAA ATG AAC TCT ACA CTA GAA GTA 767 
Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val
240 245 250 255 

AGC TTT AGA AGA TGG GAA AAT TTA GGT GCT CGG AGA ATC TGG ATC ACA 815 
Ser Phe Arg Arg Trp Glu Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr
260 265 270 

ACC TCA CAA TGG GAT GTC ATC ACA AAT AAA AAA GAA TTC ACC CTT AAT 863 
Thr Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn
275 280 285 

CTC TTC CAT GGG ACT ATT ACT TTT GCA CAC CGC AGA TTT GAG ATT CCT 911 
Leu Phe His Gly Thr Ile Thr Phe Ala His Arg Arg Phe Glu Ile Pro
290 295 300 

AAA TTT AAA AAA TTT ATG CAA ACA ATG AAC ACT GCC AAA TAC CCA GTA 959 
Lys Phe Lys Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val
305 310 315 

GAT ATT TCT CAT ACT ATA TTG GAG TGG AAT TAT TTT AAT TGT TCA ATC 1007 
Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile
320 325 330 335 

TCT AAG AAC AGC AGT AAA ATG GAT CAT ATT ACA TTC AAC AAC ACA TTG 1055 
Ser Lys Asn Ser Ser Lys Met Asp His Ile Thr Phe Asn Asn Thr Leu
340 345 350 

GAA TGG ACA GCA CTG CAC AAC TAT GAT ATG GTG ATG AGT GAT GAA GGT 1103 
Glu Trp Thr Ala Leu His Asn Tyr Asp Met Val Met Ser Asp Glu Gly 
355 360 365 

TAC AAT TTG TAT AAT GCT GTT TAT GCT GTG GCC CAC ACC TAC CAT GAA 1151 
Tyr Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu 
370 375 380 

CAT ATT TTT CAA CAA GTA GAG TCT CAG AAA AAG GCA AAA CCC AAA AGA 1199 
His Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg
385 390 395

TTT TTC ACT GTT TGT CAG CAG CAG ATA TGG AAC AGT GTG TGAAGTGTCC AT 1250
Phe Phe Thr Val Cys Gln Gln Gln Ile Trp Asn Ser Val
400 405 410 

ATGATAAGTA TGCCAACATA GAGAAAACCC ACTGCCTCTC AAGAGCTGTA TCATTTCTGG 1310 

CTTATGAAGA TCCATTGGGG ATAGCTCTAG GCTGCATAGC ACTGTCCTTC TCAGCCATCA 1370 

CAATTCTAGT ACTAATCACA TTTTTGAAGT ACAAGGATAC TCCCATTGTG AAGGCCAATA 1430

ACCGCATTCT CAGCTACATC CTGCTCATCT CTCTAGTCTT CTGCTTTCTC TGCTCCCTGC 1490 

TCTTCATTGG ACATCCAAAC CAGGTCTCCT GCGTCTTGCA GCAGACCACA TTTGGAGTAT 1550 

TTTTCACTGT GTCTGTTTCT ACAGTGTTGG CCAAAACAAT AACTGTGGTC ATGGCTTTCA 1610 

AGCTCACTAC TCCAGGAAGA AGAATGAGAG AGATGTTGGT AACAGGGGCA CCTAAGTTGG 1670 

TCATTCCCAT TTGTACCCTA ATCCAATTTG TTCTCTGTGG AATCTGGTTG ATAACATCTC 1730 

CTCCATTTAT TGACAGAGAT ATACAATCTG AGCATGGGAA GATTGTCATT CTTTGCAATA 1790 

AAGGCTCTGT CATTGCCTTC CATGTTGTCC TGGGATACTT GGGCTCCTTG GCTCTGGGGA 1850 

GCTTCACTTT GGCTTTCTTG GCTAGGAACC TTCCTGACAC ATTCAATGAA GCCAAATTCC 1910 

TGACTTTCAG CATGCTGGTG TTCTGCAGTG TCTGGATCAC CTTTCTCCCT GTCTACCATA 1970 

GCACCAGGGG GAAGGTCATG GTGGTTGTGG AGGTTTTCTC AATCTTGGCT TCTAGTGCAG 2030 

GGTTGCTAAT GTGTATCTTT GTCCCAAAGT GTTATGTTAT TTTAGTTAGA CCAGATTCAA 2090

ATTTTATACG GAAGTACAAA GATAAATTTC GTTATTGAAA TATTCATACT ATGAAAATGT 2150 

TAGATTATAC TCAACATATT TTTCTTTGTC TTAACAAAAG TAGTACTTAA TCTTATAAAA 2210

ATTTAAATAA TATACAAATT TGAACTTACA AACAGGACAG AACTGTCTAT TGTAATACCA 2270

ATTACAAAAC TTTGGTGAAA AATGGTCTCA TTCATAAGGA CACAATTCTG AAGATATTGA 2330 

GAACCAGGAA TCTCAACTGC GGAAACGCTA CCATCATCCT GACCTGTGGT TTTGTGTGTA 2390 

AAGCATGAAC TTAATTAATG ATTAATATAA GGTGACCATA CTGACTGTGA ACACTACCAT 2450 

CTCTGGGCAA GTTGTTCTTG TAGTTGTAAG AAAAAGCTCT GAAGACAACA TGGAAGTAAA 2510 

GCCAGTAATC ACCATTATCC CTCATGCTTT CATGGAGTGG CTGCATCCAA TTTCATGCCT 2570 

TGGCTTCATT CAATATACTG TGACCAAGGT ACATAAGTAA AGAAACACTT TTCTTACAAG 2630

CTTCTTCTGA TCGTTGTGGG TTTTTTTGTT TTTTGTTTTT TGTTTTTTGT TTGTTTGTTT 2690 

GTATTTTTAC ATCAACGGAA TTTAAAATAT CAACAAAATG GTAAATTGTT TCTGTTGAGA 2750 

TTTAGAATAT CATCGATTCC TGA 2773 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 412 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 



Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Asn Asp Gly Asp
1 5 10 15

Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile
20 25 30

Glu Asp Asn Phe Tyr Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser
35 40 45

Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn
50 55 60

Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Ile Ile
65 70 75 80

Gly Gly Asn Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr Thr
85 90 95

Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp
100 105 110

Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu
115 120 125

Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Ser Phe
130 135 140

Asn Pro Asn Leu His Asp His Asp Arg Leu His His Val His Gln Val
145 150 155 160

Ala Thr Lys Asp Thr His Leu Ser His Gly Ile Val Ser Leu Met Phe
165 170 175

His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Lys
180 185 190

Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly
195 200 205

Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr
210 215 220

Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu Ala
225 230 235 240

Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser
245 250 255

Phe Arg Arg Trp Glu Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr
260 265 270

Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn Leu
275 280 285

Phe His Gly Thr Ile Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys
290 295 300

Phe Lys Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp
305 310 315 320

Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser
325 330 335

Lys Asn Ser Ser Lys Met Asp His Ile Thr Phe Asn Asn Thr Leu Glu
340 345 350

Trp Thr Ala Leu His Asn Tyr Asp Met Val Met Ser Asp Glu Gly Tyr
355 360 365

Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu His
370 375 380

Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Phe
385 390 395 400

Phe Thr Val Cys Gln Gln Gln Ile Trp Asn Ser Val
405 410

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3108 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 116...2527
(D) OTHER INFORMATION: VR14 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GAATATGCAA TAAACATCTC CTTTGCCTAA AGAAATAAAA GCTGGTAGAA ATCTGATGTG 60 
CTGATATGCA TGGCACTTCA CAATCCACAC TGCCCAGGTT TAAGGCAGGA AAAAG ATG 118 

Met 
1 

TTC ATT TTC ATG GAA GTC TTC TTC CTC CTT AAT ATT ACA CTT CTC ATG 166 
Phe Ile Phe Met Glu Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met
5 10 15 

GCC AAT TTC ATT GAT CCC AGG TGC TTT TGG AGA ATA AAT TTG GAT GAA 214 
Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu
20 25 30 



ATA ATG GAT GAA TAT TTG GGA TTA TCT TGT GCT TTC ATC CTG GCA GCA 262
Ile Met Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala Ala
35 40 45

GTT CAG ACA CCC ATT GAA AAT GAT TAT TTC AAC AAG ACT CTT AAT GTT 310 
Val Gln Thr Pro Ile Glu Asn Asp Tyr Phe Asn Lys Thr Leu Asn Val
50 55 60 65 

CTA AAA ACA ACT AAA AAC CAC AAA TAT GCT TTG GCA TTG GTG TTT GCA 358 
Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala 
70 75 80 

ATG GAT GAA ATC AAC AGA AAT CCT GAT CTT TTA CCA AAT ATG TCT TTG 406 
Met Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu
85 90 95 

ATT ATA AGA TAC ACT TTG GGC CGT TGT GAT GGA AAA ACT GTA ATA CCT 454 
Ile Ile Arg Tyr Thr Leu Gly Arg Cys Asp Gly Lys Thr Val Ile Pro
100 105 110

ACA CCA TAT TTA TTT CGT AAA AAA AAA GAA AGC CCT ATC CCT AAT TAT 502 
Thr Pro Tyr Leu Phe Arg Lys Lys Lys Glu Ser Pro Ile Pro Asn Tyr
115 120 125

TTC TGT AAT GAA GAG ACT ATG TGT TCC TAT CTG CTT ACA GGA CCC CAT 550
Phe Cys Asn Glu Glu Thr Met Cys Ser Tyr Leu Leu Thr Gly Pro His 
130 135 140 145 

TGG GAG GTA TCT TTA GGT TTC TGG AAG CAC ATG AAC AGC TTC TTA TCT 598 
Trp Glu Val Ser Leu Gly Phe Trp Lys His Met Asn Ser Phe Leu Ser 
150 155 160 

CCA CGT ATC CTT CAG CTT ACC TAT GGA CCT TTC CAC TCC ATC TTC AGT 646 
Pro Arg Ile Leu Gln Leu Thr Tyr Gly Pro Phe His Ser Ile Phe Ser
165 170 175 

GAT GAT GAA CAA TAT CCC TAT CTC TAT CAG ATG GCC CCA AAG GAC ACA 694 
Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr
180 185 190 

TCT CTA GCA TTG GCA ATG GTC TCC TTC ATA CTT TAC TTT AGC TGG AAC 742 
Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Phe Ser Trp Asn
195 200 205 

TGG ATT GGC CTT GTC ATT CCA GAT GAT GAC CAA GGA AAC CAA TTT CTT 790 
Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu
210 215 220 225 

TTA GAG TTG AAG AAA CAG AGT GAA AAC AAG GAA ATT TGC TTT GCC TTT 838 
Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe
230 235 240 

GTG AAA ATG ATC TCT GTT GAT GAT GTT TCA TTT CCA CAA AAT ACT GAA 886 
Val Lys Met Ile Ser Val Asp Asp Val Ser Phe Pro Gln Asn Thr Glu
245 250 255 

ATG TAC TAC AAC CAA ATT GTG ATG TCA TCC ACA AAT GTT ATT ATC ATT 934 
Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Asn Val Ile Ile Ile
260 265 270 

TAT GGA GAA ACA TAC AAT TTC ATT GAT TTG ATC TTC AGA ATG TGG GAA 982 
Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu
275 280 285 

CCT CCC ATT TTA CAG AGA ATA TGG ATC ACC ACA AAA CAA TTG AAT TTC 1030 
Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe
290 295 300 305

CCT ACC AGG AAA AAA GAC ATA AGT CAT GGC ACA TTC TAT GGA TCA CTT 1078
Pro Thr Arg Lys Lys Asp Ile Ser His Gly Thr Phe Tyr Gly Ser Leu
310 315 320

ACT TTT CTA CCC CAC CAT GGT GTG ATT TCT GGT TTT AAA AAT TTT GTA 1126
Thr Phe Leu Pro His His Gly Val Ile Ser Gly Phe Lys Asn Phe Val
325 330 335

CAG ACA TGG TTC CAT CTC AGA AAC ACA GAT TTA TAT CTA GTA ATG CAA 1174
Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met Gln
340 345 350

GAG TGG AAA TAC TTT AAC TAT GAA GAC TCA GCA TCT ACC TGT AAA ATA 1222
Glu Trp Lys Tyr Phe Asn Tyr Glu Asp Ser Ala Ser Thr Cys Lys Ile
355 360 365

CTG AAG AAC AAT TCA TCT AAT GCC TCA TTT GAT TGG CTA ATG GAA CAG 1270
Leu Lys Asn Asn Ser Ser Asn Ala Ser Phe Asp Trp Leu Met Glu Gln
370 375 380 385

AAG TTT GAC ATG ACC TTT AGT GAG AAT AGT CAT AAC ATA TAC AAT GCT 1318
Lys Phe Asp Met Thr Phe Ser Glu Asn Ser His Asn Ile Tyr Asn Ala
390 395 400

GTG CAT GCC ATA GCC CAT GCC CTC CAT GAG ATG AAT CTG CAA CAG GCT 1366
Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala
405 410 415

GAT AAT CAG GCA ATA GAC AAT GGG AAA AAG GAG CCC AGT TCC TCC CAC 1414
Asp Asn Gln Ala Ile Asp Asn Gly Lys Lys Glu Pro Ser Ser Ser His
420 425 430

TGC TTG AAG GTA AAC TCC TTT CTA AGA AGG ATT TAC TTC ACT AAT CCT 1462
Cys Leu Lys Val Asn Ser Phe Leu Arg Arg Ile Tyr Phe Thr Asn Pro
435 440 445

CCT GGG GAC AAA GTG TTT ATG AAG CAA AGA GTA ATA ATG CAC GAT GAA 1510
Pro Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met His Asp Glu
450 455 460 465

TAT GAC ATT GTT CAC TTT GTG AAT CTC TCA CAA CAC CTT GGG ATT AAG 1558
Tyr Asp Ile Val His Phe Val Asn Leu Ser Gln His Leu Gly Ile Lys
470 475 480

ATG AAG TTA GGA AAG TTC AGC CCA TAT TTA CCA CAT GGT CGA CAC TCT 1606
Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser
485 490 495

CAC TTA TAT GTA GAC AGG ATT GAG TTG GCC ACA GGA AGA AGA AAG ATG 1654
His Leu Tyr Val Asp Arg Ile Glu Leu Ala Thr Gly Arg Arg Lys Met
500 505 510

CCA TCC TCT GTG TGC AGT GCT GAT TGT AGT CCT GGA TTC AGA AGA TTA 1702
Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu
515 520 525

TGG AAG GAG GGA ATG GCA GCC TGC TGT TTT GTT TGC AGC CCC TGC CCT 1750
Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro
530 535 540 545

GAA AAT GAA ATT TCT AAT GAG ACA ACT GTG GTA CTT TGT GTC TTT GTG 1798
Glu Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Cys Val Phe Val
550 555 560

AAG CAT CAT GAC ACT CCT ATT GTG AAG GCC AAT AAC AGA AGC CTC AGC 1846
Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser
565 570 575

TAC CTA TTA CTC ATG TCA CTC ATG TCC TGT TTT CTG TGC TCC TTT TTC 1894 
Tyr Leu Leu Leu Met Ser Leu Met Ser Cys Phe Leu Cys Ser Phe Phe 
580 585 590 

TTC ATT GGC CTT CCA AAC AGA GCC ATC TGT GTC TTA CAG CAA ATC ACA 1942 
Phe Ile Gly Leu Pro Asn Arg Ala Ile Cys Val Leu Gln Gln Ile Thr
595 600 605 

TTT GGA ATT GTA TTC ACT ATG GCT GTT TCC ACA GTT CTG GCC AAA ACA 1990 
Phe Gly Ile Val Phe Thr Met Ala Val Ser Thr Val Leu Ala Lys Thr
610 615 620 625 

GTC ACT GTG GTT CTG GCT TTC AAA GTC ACA GAC CCA GGA AGA AGA TTG 2038 
Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu 
630 635 640 

AGA AAC TTC CTG GTA TCA GGA ACA CCC AAC TAC ATT ATT CCC ATA TGT 2086 
Arg Asn Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys
645 650 655 

TCC CTA CTC CAA TGT GTT CTG TGT GCA ATC TGG CTA GCA GTT TCT CCT 2134 
Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro
660 665 670 

CCC TTT GTT GAT ATT GAT GAA CAC ACT CTC CAT GGC CAC ATC ATC ATT 2182 
Pro Phe Val Asp Ile Asp Glu His Thr Leu His Gly His Ile Ile Ile
675 680 685 

GTG TGC AAC AAG GGC TCA GTT ACT GCA TTC TAC TGT ATC CTA GGA TAC 2230
Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Ile Leu Gly Tyr
690 695 700 705 

TTG GCC TGC CTG GCA CTT GGA AAC TTC TCT GTG GCT TTC TTG GCC AAG 2278 
Leu Ala Cys Leu Ala Leu Gly Asn Phe Ser Val Ala Phe Leu Ala Lys 
710 715 720 

AAT CTG CCT GAC ACA TTC AAT GAA GCC AAG TTC TTG ACC TTC AGC ATG 2326 
Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met 
725 730 735 

CTA GTG TTC TGT AGT GTC TGG GTC ACC TTC CTC CCT GTC TAC CAT AGC 2374 
Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser 
740 745 750 

ACC AAG GGC AAA CAC ATG GTT GCT GTG GAG ATC TTC TCC ATC TTG GCA 2422 
Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala
755 760 765 

TCC AGT GCT GGG ATC CTT GGA TGT ATA TTT GTA CCC AAG ATT TAT ATC 2470 
Ser Ser Ala Gly Ile Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr Ile
770 775 780 785 

ATT TTA ATG AGA CCA GAG AGA AAT TCG ACC CAA AAG ATC AGG GAA AAA 2518 
Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys
790 795 800 

TCA TAT TTC TGAACAAATA TTTAGGAATT CTGTCAAATG TAAAGTTGGT ACATACCCA 2576 
Ser Tyr Phe 



CCAAATATTG GGTTATAGTG CATGTGTCTA GTTTTAGAAT CACTCTCACT GGTTGCTCTA 2636

GTGATATCAG CAAGTATCAT ATCTACTGAA CTTCCCTACA GTGTCCATAA AATCTTGTAC 2696

TCATTCACTT TCTTCATTTT CTCTCAGAGA ACTAAACTCT CTAATTATTA CAATTTTATT 2756

CTTCATTTTG CTTTCATGGA GATTGCCCTC TGGTAACTTC CAAAAAATGT TGATAAGGCA 2816

GTTGAATCCA CCACTTTGTG TAGAAAAATG AGATCTAGGA AGACAGGGTT ACACATAAAA 2876 

ACCATCTACC AAAATAAATA ATCAATGAGA AACACAGACT AACTAAATAA TCAGCAAAGA 2936

TGAAATCAGA ACATATTTTC TAATTTCCAG TAAGAGCACA CACATAAGAA AATACTTACT 2996 

TTTTTCATCT GTTCTTCAAT CTACTGGCCA ATAGTCTAAG GAGGAAATGT TCCTTTTCTG 3056

CTGTCAAATA AAAATATATT ATATCCAAAA AAAAAAAAAA AAAAAAAAAA AA 3108 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 804 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 



Met Phe Ile Phe Met Glu Val Phe Phe Leu Leu Asn Ile Thr Leu Leu
1 5 10 15

Met Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp
20 25 30

Glu Ile Met Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala
35 40 45

Ala Val Gln Thr Pro Ile Glu Asn Asp Tyr Phe Asn Lys Thr Leu Asn
50 55 60

Val Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe
65 70 75 80

Ala Met Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser
85 90 95

Leu Ile Ile Arg Tyr Thr Leu Gly Arg Cys Asp Gly Lys Thr Val Ile
100 105 110

Pro Thr Pro Tyr Leu Phe Arg Lys Lys Lys Glu Ser Pro Ile Pro Asn
115 120 125

Tyr Phe Cys Asn Glu Glu Thr Met Cys Ser Tyr Leu Leu Thr Gly Pro
130 135 140

His Trp Glu Val Ser Leu Gly Phe Trp Lys His Met Asn Ser Phe Leu
145 150 155 160

Ser Pro Arg Ile Leu Gln Leu Thr Tyr Gly Pro Phe His Ser Ile Phe
165 170 175

Ser Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp
180 185 190

Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Phe Ser Trp
195 200 205

Asn Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe
210 215 220

Leu Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala
225 230 235 240

Phe Val Lys Met Ile Ser Val Asp Asp Val Ser Phe Pro Gln Asn Thr
245 250 255

Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Asn Val Ile Ile
260 265 270

Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp
275 280 285

Glu Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn
290 295 300

Phe Pro Thr Arg Lys Lys Asp Ile Ser His Gly Thr Phe Tyr Gly Ser
305 310 315 320

Leu Thr Phe Leu Pro His His Gly Val Ile Ser Gly Phe Lys Asn Phe
325 330 335

Val Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met
340 345 350

Gln Glu Trp Lys Tyr Phe Asn Tyr Glu Asp Ser Ala Ser Thr Cys Lys
355 360 365

Ile Leu Lys Asn Asn Ser Ser Asn Ala Ser Phe Asp Trp Leu Met Glu
370 375 380

Gln Lys Phe Asp Met Thr Phe Ser Glu Asn Ser His Asn Ile Tyr Asn
385 390 395 400

Ala Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln
405 410 415

Ala Asp Asn Gln Ala Ile Asp Asn Gly Lys Lys Glu Pro Ser Ser Ser
420 425 430

His Cys Leu Lys Val Asn Ser Phe Leu Arg Arg Ile Tyr Phe Thr Asn
435 440 445

Pro Pro Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met His Asp
450 455 460

Glu Tyr Asp Ile Val His Phe Val Asn Leu Ser Gln His Leu Gly Ile
465 470 475 480

Lys Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His
485 490 495

Ser His Leu Tyr Val Asp Arg Ile Glu Leu Ala Thr Gly Arg Arg Lys
500 505 510

Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg
515 520 525

Leu Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys
530 535 540

Pro Glu Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Cys Val Phe
545 550 555 560

Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu
565 570 575

Ser Tyr Leu Leu Leu Met Ser Leu Met Ser Cys Phe Leu Cys Ser Phe
580 585 590

Phe Phe Ile Gly Leu Pro Asn Arg Ala Ile Cys Val Leu Gln Gln Ile
595 600 605

Thr Phe Gly Ile Val Phe Thr Met Ala Val Ser Thr Val Leu Ala Lys
610 615 620

Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg
625 630 635 640

Leu Arg Asn Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile
645 650 655

Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser
660 665 670

Pro Pro Phe Val Asp Ile Asp Glu His Thr Leu His Gly His Ile Ile
675 680 685

Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Ile Leu Gly
690 695 700

Tyr Leu Ala Cys Leu Ala Leu Gly Asn Phe Ser Val Ala Phe Leu Ala
705 710 715 720

Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser
725 730 735

Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His
740 745 750

Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu
755 760 765

Ala Ser Ser Ala Gly Ile Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr
770 775 780

Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu
785 790 795 800

Lys Ser Tyr Phe



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3689 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 39...419

(D) OTHER INFORMATION: VR15 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TCAAAATCCG CACTGCCCAA GTTTAAGGCA GGAAAAAT ATG TTC ATT TTC ATG GGA 56 

Met Phe Ile Phe Met Gly

1 5

GTC TTC TTC CTC CTT AAT ATT ACA CTT CTC ATG GCC AAT TTC ATT AAT 104 
Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met Ala Asn Phe Ile Asn
10 15 20 

CCC AGG TGC TTT TGG AGA ATA AAT TTG GAT GAA ATA ACG GAT GAA TAT 152 
Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu Ile Thr Asp Glu Tyr
25 30 35 

TTG GGA TTA TCT TGT ACT TTC ATC CTG GCG GCA GTT GAG ACA CCC ACT 200 
Leu Gly Leu Ser Cys Thr Phe Ile Leu Ala Ala Val Gln Thr Pro Thr
40 45 50 

GAA AAA GAT TAT TTC AAC AAG ACT CTT AAT GTT CTA AAA ACA ACT AAA 248
Glu Lys Asp Tyr Phe Asn Lys Thr Leu Asn Val Leu Lys Thr Thr Lys 
55 60 65 70 

AAC CAC AAA TAT GCT TTG GCA TTG GTG TTT GCA ATG GAT GAA ATC AAC 296 
Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala Met Asp Glu Ile Asn
75 80 85 

AGA AAT CCT GAT CTT TTA CCA AAT ATG TCT TTG ATT ATA AGA TAC ACT 344 
Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Arg Tyr Thr
90 95 100 

TTG GGC CTT TGT GAT GGA AAA ACT GTA ACA CCT ACA CCA TAT TTA TTT 392 
Leu Gly Leu Cys Asp Gly Lys Thr Val Thr Pro Thr Pro Tyr Leu Phe 
105 110 115 

CAT AAA AAA AAA ACA AAG CCC TAT CCC TAATTATTTC TGTAATGAAG AGACTAT 446 
His Lys Lys Lys Thr Lys Pro Tyr Pro 
120 125 

GTGTTCATTT CTGCTTTCAG GACCCAAGTG GGATGTATCT TTAAGTTTCT GGATGTACCT 506

GGACAGCTTC TTATCTCCGC GTATCCTTCA GCTTACCTAT GGACCTTTCC ATTCTATCTT 566 

CAGTGATGAT GAACAATATC CCTATCTCTA TCAGATGGCC CCAAAGGACA CATCTCTAGC 626 

ATTGGCAATG GTCTCCTTCA TACTTTATTT GAAATGGAAC TGGATTGGCC TTGTCATCCC 686 

AGATGACGAT CAAGGAAACC AATTTCTTTT AGAGTTGAAG AAACAGAGTG AAAACAAGGA 746 

AATTTGCTTT GCCTTTGTGA AAATGATCTC TGTTGATGAT ACTTCATTTC CACATAAAAC 806

TGAAATGGAC TACAACCAAA TTGTGATGTC ATCCACAAAT GTTATTATCA TTTATGGAGA 866 

AACACGCAAT TTCATTTATT TGATCTTCAG AATGTGGGAA CCTCCCATTT TACAGAGAAT 926 

ATGGATCACC ACAAAACAAT TGAATTTCCC TACCAGGAAG ACAGACATAA GTCATGGCAC 986 

ATTCTATGGA TCACTTACTT TTCTACCCCA CCATGGTGAG ATTTCTGGCT TTAAAAAGTT 1046 

TGTACAGACA TGGTTCCATG TCAGAAACAC AGATTTATAT TTAGTAATGC CAGAGTGGAA 1106 

CTATTTTAAC TATGTAAGCT CAGCATCCAA TTGTAAAATA CTGAAGAACA ATTCATCTGA 1166 

TGCCTCATTT GATTGGCTAA TGGAACAGAA GTTTGACATG ACCTTTAGTG AGAATAGTCA 1226 

TAACATATAC AATGCTGTGC ATGCCATAGC CCATGCCCTC CATGAGATGA ATCTGCAACA 1286 

GGCTGATAAT CAGGCAATAG GCAATGGAAA AGGAGCCAGT TCTCACTGCT TGAAGGTAAA 1346 

CTCCTTTCTA AGAAGGACCT ACTTCACTAA TCCTCTTGGG GACAAAGTGT TTATGAAGCA 1406 

AAGAGTAATA ATGCAGGATG AATATGATAT TATTCACTTT GGGAATCTCT CACAACACCT 1466

TGGGATTAAG ATGAAGTTAG GAAAGTTCAG CCCATATTTA CCACATGGTC GACACTCTCA 1526

CTTATATGTA GACATGATTG AGTTGGCCAC AGGAAGAAGA AAGATGCCAT CCTCTGTGTG 1586

CAGTGCAGAT TGTAGTCCTG GATTCAGAAG ATTGTGGAAG GAGGGAATGG CAGCCTGCTG 1646 

TTTTGTTTGC AGCCCCTGCC CAGAAAATGA AATTTCTAAT GAGACAAGCT CCTCTCCATT 1706 

TCATCCTTGC ATTCAGACAG GAACAATTAT GGGCTGGAGA TGTGACTATG GGATGGGAAT 1766 

CCCATCACTC ACTTGATGTC CTGTCTTCCG GCTGGAGGTG GGCTCTTTAA GTTAACACTA 1826 

TCTACTGTAG TACATTTCAT CTAAGGTCTC TGACCTCCCA AGTCTCTGGT GCATTTTGGT 1886 

GGGTCCACCC ACCCTCCTAT TACCTGAAGT TGCCTGTTTA TATTCTTTTT GCTGGTCCTC 1946

AGAGATCGGT TCCCCTCTCA CCTGCCCACA CACCACAAAC CCCTTTCAAA TAACATCATA 2006 

AATGATACAA TGAAGTTAAG TATACAAAGA ACAAATTGCT TGGTTTTATT TCATTTAAAT 2066

CTTTATGAAC TTTATGAATT GAAATCAATG CTCGGCAACA GCATCCTTCA CATTACATAT 2126

CAGCATCAAA GGCAGCATTG CAAGGCTTCT TTCATTACCC TTACTTGAAT TACCTTGACA 2186 

ATAAAATTTC TGAAGCAGAC CTAACTAAGC TTTCCTTTGG AAATCAGATA TGGATCAATG 2246 

TGTGAATTGT CCAGAATACC AATATGCCAA CACAGAACAG AACAAATGTA TTCAGAAAGG 2306 

TGTCACCTTC CTAAGCTATG AAGACCCCTT GGGGATGGCA CTTGCCTTAA TGGCCTTCTG 2366 

CTTCTCTGCA TTCACAGCTG TGGTACTTTG TGTCTTTGTG AAGCACCATG ACACTCCTAT 2426 

TGTGAAGGCC AATAACAGAA GCCTCAGCTA CCTATTACTC ATGTCACTCA TGTTCTGTTT 2486

TCTGTGCTCC TTTTTCTTCA TTGGCCTTCC AAACAGAGCC ATCTGTGTCT TACAGCAAAT 2546 

CACATTTGGA ATTGTATTCA CTGTGGCTGT TTCCACAGTT CTGGCCAAAA CAGTCACTGT 2606 

GGTTCTGGCT TTCAAAGTCA CAGACCCAGG GAGAAGATTG AGAAAGTTCC TGGTATCAGG 2666 

GACACCCAAC TACATTATTC CCATATGTTC CCTACTCCAA TGTGTTCTGT GTGCAATCTG 2726 

GCTAGCAGTT TCTCCTCCCT TTGTTGATAT TGATGAACAC ACTCTCCATG GCCATATCAT 2786 

CATTGTGTGC AACAAGGGCT CAGATACTGC ATTCTACTGT ATCCTGGGAT ATTTGGCCTG 2846 

CCTGGCACTT GGAAGCTTCT CTCTGGCTTT CTTGGCCAAG AATCTGCCTG ACACATTCAA 2906 

TGAAGCCAAA TTCTTGACCT TCAGCATGCT AGTGTTCTGT AGTGTCTGGG TCACCTTCCT 2966 

CCCTGTCTAC CATAGCACCA AGGGCAAACA CATGGTTGCT GTGGAGATCT TCTCCATCTT 3026 

GGCATCCAGT GCAGGGATCC TTGGATGTAT TTTTGTACCC AAGATTTATA TCATTTTAAT 3086

GCGACCAGAG AGAAATTCTA CCCAAAAGAT CAGGGAAAAA TCATATTTCT GAACAAATAT 3146 

TTAGGAATTC TGTCAAATGT AAAGTTGGTA CATACCCACC AAATATTGGG TTATAGTGCA 3206 

TGTGTCTAGT TTTAGAATCA CTCTCACTGG TTGCTCTAGT GATATCAGGA AGTATCATAT 3266 

CTACTGAACT TCCCTACAGT GTCCATAAAA TCTTGCACTC ATTCACTTTC TTCATTTTCT 3326 

CTCAGAGAAC TAAACTCTCA ATTATTACAA TTTTATTCTT CATTTTGATT TCATGGAGAT 3386 

GGCCCTCTGG TAACTGCCAA AAAATGTTGA TAAGGCAGTT GAATCCACCA CTTTGTGTAG 3446 

AAAAATGAGA TCTAGGAAGA CAGGGTTACA CATAAAAACC ATCTACCAAA TCAAATAATC 3506 

AATGAGAAAC ACAGACTAAC TAAATAATCA GCAAAGATGA AATCAGAACA TATTTTCTGA 3566 

TTTCCAGTAA GAGCACACAC ATAAGAAAAT ACTTACTTTT TTCATCTGTT CTTCAATCTA 3626

CTGGCCAATA GTCTAAGGAG GAAATGTTCC TTTTCTGCTG TCAAATAAAA ATATATTATA 3686 

TCC 3689

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



Met Phe Ile Phe Met Gly Val Phe Phe Leu Leu Asn Ile Thr Leu Leu
1 5 10 15

Met Ala Asn Phe Ile Asn Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp
20 25 30

Glu Ile Thr Asp Glu Tyr Leu Gly Leu Ser Cys Thr Phe Ile Leu Ala
35 40 45

Ala Val Gln Thr Pro Thr Glu Lys Asp Tyr Phe Asn Lys Thr Leu Asn
50 55 60

Val Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe
65 70 75 80

Ala Met Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser
85 90 95

Leu Ile Ile Arg Tyr Thr Leu Gly Leu Cys Asp Gly Lys Thr Val Thr
100 105 110

Pro Thr Pro Tyr Leu Phe His Lys Lys Lys Thr Lys Pro Tyr Pro
115 120 125

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3896 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 36...263
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



ATTTCACAAC TTCTTGATCT TAGACCTTAG CAGAT ATG AAA AAC CTG TGT GTT 53 

Met Lys Asn Leu Cys Val 
1 5 

TTC ACT CTT TCC TTT TTC CTC CTG GAG TTT TCT CTG ATC TTG TGC CAT 101 
Phe Thr Leu Ser Phe Phe Leu Leu Glu Phe Ser Leu Ile Leu Cys His
10 15 20 

TTG ACT GAA CCC ATT TGC TTT TGG AGG ATA AAT AAT AAT GAA GAT AAT 149 
Leu Thr Glu Pro Ile Cys Phe Trp Arg Ile Asn Asn Asn Glu Asp Asn
25 30 35 

GAT GGA GAT TTG AGA AGT GAC TGT GGT TTT TTC CTT GCA GCA GTT GAG 197 
Asp Gly Asp Leu Arg Ser Asp Cys Gly Phe Phe Leu Ala Ala Val Glu 
40 45 50 

GGA CCT ACT GAC GAC TCT TAT AAT ATC TCT GAT CTT AGG TTT TCT TTG 245 
Gly Pro Thr Asp Asp Ser Tyr Asn Ile Ser Asp Leu Arg Phe Ser Leu
55 60 65 70 

GAC CAT TTA ATC CTA AGC TGAGTGACCA TGACCAGTTT CCCTATGTCC ATCAGGTA 301
Asp His Leu Ile Leu Ser
75 



GCCACCAAGG ACACACGTTT GTCCCATGCA ATGGTCTCCT TGATGTTTCA TTTTACATGG 361 

ATTTGGATAG GAATGGTCAT CTCAGATGAT GACCAGAGTA TTCAGTTTCT ATCAGACATG 421 

AGAGAAGAAA TGCAAAGACA TGGAATCTGT TTAGCTTTTG TTAATATGAT CCCAGAAGAC 481 

ATGCAGTTAT ATATGACAAG GGCTACAATA TATGATAAAC AAATTATGGA ATCAACAGCA 541 

AAGGTTGTTA TGATTTATGG TGAAATGAAC TCTACCTTAG AAGTTAGCTT TAGAAGGTGG 601 

GAAGATTTAA GTATAAGGAG AATCTGGATC ACAACCTCAC AATGGGACGT TATCACAAAT 661 

AAAAATGATT TCAGCCTTGA TTTCTTCCAA GGGACTGTCA CTTTTGCACA CCATGTAGGT 721 

GAAATTGCTA ACTTTAGGAA TTTCTTGCAA ACAATGAACA GTGAAAAATA CACAGTAAAC 781

ATTTCTGAGT CTAGACTGGG GTGGAATTAT TTTAATTGTT CCATCTCTAA GAACAGCAAT 841 

AAAAAGGATC ATTTTACATT CAACAACACA TTGGAATGGA CAACACTGCA CAAATATGAC 901 

ATGGTCCTAA GTGAGGAAGG CTACAATTTG TATAATGCTG TGTATGCTGT GGCCCACACC 961 

TACCATGAAC TCGTTCTTCA ACAAGTAGAA TCTCAGCAAA TGACAGTACC CAAAGGAACA 1021 

TTCACTGACT GTCAGCAGGT GTCTTCCATG CTGAAGTCCA GGATATTTAC TAACCCTGTT 1081 

GGAGAACTGG TGAACATGAA GCATAGGGAA AATCAGTGTA CAGAGTATGA TATTTTCATC 1141 

ATTTGGAATT TTCCACAAGG CCTTGGATTA AAAGTGAAAA TAGGAAGCTA TTTGCCTTGT 1201 

TTCCAACAGA GCCAACAACT TCATATATCT GAAGATTTGG AGTGGGCCAC AGGAGGATCA 1261 

TCAGTACCCC CCTCCCTGTG TAGTGTAACA TGTACTGCTG GATTCAGGAA AATTCATCAG 1321 

AAACAAACAG CAGACTGCTG CTTTGATTGT GATCAGTGCC CAGAAAATGC AGTTTCCAAT 1381 

GAAACAGAGA TATGCAATCT GAACATGGAA AGACCATCAT TATTTGCAAC AAAGGCTCAG 1441

TAATTGCCTT CCACTTTGTT CTCGGATACT TGGGTGCCTT GGCTCTGGGG AGCTTTACTG 1501

TGGCTTTCTT GGCTAGGAAC CTTCCTGACA GATTCAATGA AGCCAAATTC TTAACCTTCA 1561

GCATGCTGGT GTTCTGCAGT GTCTGGATCA CCTTCCTCCC TGTCTACCAC AGCACCCAGG 1621 

GAACGGTCAT GGTGGTTGTG GAGGTTTTCT CCATCTTGGC TTCTAGTGCA GGCTTGCTAG 1681 

GGTGTATCTT TCTCCCAAAA TGTTGTGTTT TATTACGTAT ACAAAATTCA AACTTTCTGC 1741 

ATAAGTACAA ACATGAATTG CATTCTTGAT TCTTTAGTAA TTTAAAAATG CTAATCATAC 1801 

TCAACTTATC TTTTTGCTTT GTCATAACAA AAGCACCACT AAATCATACA AAAAATTTAA 1861 

GTAATATACA AATTTAGTAT TTACAATGTA GGGCAGCACA GCACTGCCTA ATGTAATGCC 1921 

AATTATTGTT TTAGAGGTAA ATGGTCTTAT TCATGTGTAC ATAGATGTAA ACATTGAGAA 1981 

TAGGGAATCT AACTTGATGA ATGGCTATCA ACACTTTGAC CTCTAGGTAT GTGTGTAAGC 2041 

CATGTACCTA ATTTAATATG TAATAAGGTG AGCGTAACAT ATGTGAGAGT GCTACCTCTG 2101 

GGCAGAAAGT TCTGGGAATT ATAAGAAAGA GGACTTCAAA GAGCACAGGC ATGAAGTCAA 2161 

TAATCAGCAT TATTCCATGT GCTCTCATTG AGTGTCTGCA TCCACGTTCT TGTCTTGACT 2221 

TCATTCTATT AACTGTGACT AAGGTACATA GGGAAATAGG ACTTTTCTCA CATGGTTCCT 2281 

TTGACCATGG TGTTTTCTTA CAGCAACAGA CTCTAAGACA TCAGCAAAAT GTTAAATTGC 2341 

CTTGGTTAGG ATTTGGAATA TCACAGATTA CTGATGCAAT AGAAGGCACT GATTTGAAAG 2401 

AGAAAATAGA TTGAATACTA GGGGAGTGTG AGCATAGTTA CAGTGTTGCA TATTGTTGAT 2461 

GGCCATCACA GAGGCCTGAG ATTTGTAATT GCTTCATAAT GTACTATGAA AATATTCAGA 2521 

ATATCAGGTA ACATACTAAA AGAAGTACAA TATATGAAAA GGACAATGGG GTTCAGATTA 2581 

TGCCTGCTCT ATAAGGCTCA TGAACTTCAT ATGAAAACAT ACCATTTCAA TATGAAATGA 2641 

AGAAGTTTCA TTCAGGGAGA AAAATTGGTA GTGGAAAAAT TTACACACAA GACCTATATC 2701 

ACAAGGAGAT CAGTGAAATC TTGGAATATA TAAGGCACTC TAGAAGAATG ACTTCAAAAA 2761 

TGTTAGCAAA ATAGGAACAA CTAAGAATTA TTTGGTTTAA TATTACATAA TCAAAGATGT 2821 

ACATACAAAC ACATGAACAT TATTATTTCT GGACGTCAGT TGCTGAAGGT CAGTGTCATT 2881 

TTCTCTCAAA GTATTGTTTG TTGCTCTTAT TTTACTTGTT AATTTACAGT TTATTTTTGA 2941 

TGGGATAATT TAATTGTTTT TTTCTTTATA TTTCCTGTCT CAAGAACACC ACTTGTAGCC 3001 

CATCCATACA CTCCTAAAAT GCAAATGACC TATTATTTCA TTAATGCTTA ATGAATGCAT 3061

GCATGTATTT GTATATACAT ATACATTTTA AAGTATACAT TGTAGATACT ATGTAAAATT 3121 

GCATGTTTTT ATGTTTTGAT GGCTCATTAT TTGGTAATAC CTGGCCAATA TTTGTTCCCT 3181 

TCCCTGGCTA TGACAACCTC CTCCATTCCC TGATTTAAAG TTTCCTGTAA ATGGTTGTGT 3241 

AGGGTAGAAG CTTTGAAAGC TTTCTTCCTT CCACGCTGCC ATGCACAGTG CAGTAATCCT 3301 

TCTTCAGACC ATATTTTGTG TGTCATATTG GTAAAACTTC ATGGTCTACT TATGCTAGTT 3361 

CTAGAAGATT TGTGTTCACA GCCAGTTTCC TCATCCTTTG ACTCACAAGA TCTTTTCCAC 3421 

CATCTTCTTT ACGTTTCTCT GAGCCTTGGA TGAGGGAAAA TTTTGTAAGA GGATACATTG 3481 

AATTGTTTCC TTCAACTACC TACTCTGGAA ATGACTATCA CACTATCACA ACATCTTTAA 3541 

AAACAAGATG GAACTCCAAA ATCATTTTCT AAGGAAATAA ATGAAAATCT AAGTGTTCTT 3601 

TTAATCTGGT TCATTGGAAT TTCCTGCATT TATCTGCCTG GGTGTATGTA ATCCCCCCCC 3661 

CCCAGCCTGA AACCTGGCTG AACAGGTTTC ACTGTTAGCA CGAAGAGAGA ATCCGGGGTG 3721 

GAGCCTTCCA CCCTATCATT CTGCCACTCC CACTGCTACT GCCTGCCGCC CAGCTGTTCC 3781 

GGAGCTATCA CGTGGTCACC TGAAATTGGA CTCCAAGGAT GATTTGGAGG GAATGGGTGC 3841 

CTTCCCCTTC TTCATAAACC AGTGTCTGGG AATAGTAAAA TTGAACTTTG ATCAG 3896 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



Met Lys Asn Leu Cys Val Phe Thr Leu Ser Phe Phe Leu Leu Glu Phe
1 5 10 15

Ser Leu Ile Leu Cys His Leu Thr Glu Pro Ile Cys Phe Trp Arg Ile
20 25 30

Asn Asn Asn Glu Asp Asn Asp Gly Asp Leu Arg Ser Asp Cys Gly Phe
35 40 45

Phe Leu Ala Ala Val Glu Gly Pro Thr Asp Asp Ser Tyr Asn Ile Ser
50 55 60

Asp Leu Arg Phe Ser Leu Asp His Leu Ile Leu Ser
65 70 75

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 962...2605

(D) OTHER INFORMATION: GoVN1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

GAAACGTCTA CTAATATGCT GTTCTCTTGG CTTTTTATCT CCTTGTTTCT ACAGATGCCA 60 

ACTCTCATCT GGACGATTGC AACCCGTTCC TGCCTAACTG AATCAGGATA CCTCGTACAC 120 

CAGGATGGAG CTGTGGTCAT TGGTGCATTT TTTCCTGTTT TAAAGTCCTT GCCTATAAGT 180

GAAATAATAG ATTGGAAAAC ATTATCTTTT GACACATACA ATTCTTTATG GATAAATGCA 240 

CAAATGTACC AACTTGTTTT GGCCATGATA TTTGCGATCA ATGAGATCAA TGTGAAGTCC 300 

CATATTTTAC CAAATACCTC TCTGGGACTT GAGATTTATA ATCTGCCATA TTTTGAACGG 360 

AATATTCTGA GGAGTGCACT ATCTTGGCTC ACAGGCTTGA GCAAATTCAT TCCTAATTAC 420 

ACCTGCAGAA AGGATAGCAA ATCAGCTGCT GCACTTACTG GAATATCACA GAAAACATCT 480 

GAGACCTTTG GGACTTTGTT GGACATTTAC AAATTTCCTC AGCTTAATTT TGGGCCGTGT 540 

GATCCTGTTC AGATAGGCAG AAACCAGTTT CCATCTGTGT ACCAGGTGGC CCCCAAAGAC 600 

ACACCTCTGT TCTGTGGTAT CACCTCTTTG ATGCTTCATT TCAACTGGAC CTGGGTGGGA 660 

CTGCTAATCA CAGATGACAA CAGAGGTTCT CAGTTTCTAT CAGAGTTAAG AAAGGAGCTG 720 

GACAAGAATA AAATCTGCAT AGCCTTTGTG GAAACAGTAA TATTTTTTGG GGAATCATTG 780 

CATTATATGC TAACCCACAA TCAGATGCAG ACTCTAGAGT CATCAGCAAA TGTGATTATA 840

GTTTATGGAC ATTTTGCTTT TCAATTAATT GTAATACAAA GTAAACACAG AAAGTATGAA 900 

ATGAAAAAGA TTTGGGTCAT AACCTCAAAA TGGGTTGGCC AAAAAAATTG AACAATATAC 960 

C ATG TTA GAA TTG GCC CAT GGC ACT CTG ACT TTC TCA CCC CAT CAT GGG 1009 
Met Leu Glu Leu Ala His Gly Thr Leu Thr Phe Ser Pro His His Gly 
1 5 10 15

GAG ATT TCT GAT TTC ACA AAT TTT ATG CAG GAA GTC ACC CCT ATC AAG 1057 
Glu Ile Ser Asp Phe Thr Asn Phe Met Gln Glu Val Thr Pro Ile Lys
20 25 30 

TAG CCA GAA GAC ATT TTT CTT CAC ATC TTG TGG AAC CAG TAT TTC AAT 1105 
Tyr Pro Glu Asp Ile Phe Leu His Ile Leu Trp Asn Gln Tyr Phe Asn
35 40 45 

TGT CCA CTT TTG CAT TCT GAG TGT AAA ATC TTT GAA AAC TGT ATA CCC 1153 
Cys Pro Leu Leu His Ser Glu Cys Lys Ile Phe Glu Asn Cys Ile Pro
50 55 60 

AAT GCC TCT TTG GAA TTG TTG CCA GGG GGT GTT TTT GAG CTG GTC ATG 1201 
Asn Ala Ser Leu Glu Leu Leu Pro Gly Gly Val Phe Glu Leu Val Met 
65 70 75 80 

ACT GAA GAG AGT TAC AAT GTG TAC AAT GCT GTG TAT GCA GTG GCC CAC 1249 
Thr Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His 
85 90 95 

AGT CTC CAT GAG AAG GCT CTC CAT CAA GTA GAA ATT CAA CCA CAG GAT 1297 
Ser Leu His Glu Lys Ala Leu His Gln Val Glu Ile Gln Pro Gln Asp
100 105 110 

AAT AAA GAT AGG ACT ATA TTA TTT CCT TGG CAG CTT CAC CCT TTT CTG 1345 
Asn Lys Asp Arg Thr Ile Leu Phe Pro Trp Gln Leu His Pro Phe Leu
115 120 125 




AAG AAC ATT CAG CTG ATA AAT TCT GTT GGT GAT CGT GTG ATT CTG GAC 1393 
Lys Asn Ile Gln Leu Ile Asn Ser Val Gly Asp Arg Val Ile Leu Asp
130 135 140 

TGG AAA AAG AAG ACG GAT ACA GAG TAT GAT ATT TCC AAT ATT TGG AAT 1441 
Trp Lys Lys Lys Thr Asp Thr Glu Tyr Asp Ile Ser Asn Ile Trp Asn
145 150 155 160 

TTC CCA ACA GGT CTT TCC TTA TTA GTG AAA GTG GGT ACA TTT GCT CCA 1489 
Phe Pro Thr Gly Leu Ser Leu Leu Val Lys Val Gly Thr Phe Ala Pro 
165 170 175 

AGT GCT CCC AAG GGG GAA CAA CTT TCG ATA TCT GAA CAC ACA ATT AAC 1537 
Ser Ala Pro Lys Gly Glu Gln Leu Ser Ile Ser Glu His Thr Ile Asn
180 185 190 

TGG CCC ATA GGA TTT ACA GAG ATT CCA AAG TCT GTA TGC AGT GAG AGC 1585 
Trp Pro Ile Gly Phe Thr Glu Ile Pro Lys Ser Val Cys Ser Glu Ser
195 200 205 

TGC AGT CCT GGA CAC AGG AAA GTC ATC CTG GAG AGC AAG CCT GCC TGT 1633 
Cys Ser Pro Gly His Arg Lys Val Ile Leu Glu Ser Lys Pro Ala Cys
210 215 220 

TGC TTT GAC TGC ACT CCT TGC CCA GAT AAA GAG ATT TCC AAC GAG ACA 1681 
Cys Phe Asp Cys Thr Pro Cys Pro Asp Lys Glu Ile Ser Asn Glu Thr
225 230 235 240 

GAT GTG GGT CAG TGT GTG AAG TGT CCT GAA TCT CAT TAT GCA AAT ACA 1729 
Asp Val Gly Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr
245 250 255 

GAG AAG AGT CAC TGC CTG AAG AAG ACT ATG ACC TTT CTG GAT TAT AAT 1777 
Glu Lys Ser His Cys Leu Lys Lys Thr Met Thr Phe Leu Asp Tyr Asn 
260 265 270 

GAT TCC TTG GGG ACG GGA CTC ACA CTC ATG TCT CTG GGA TTC TTT GTT 1825 
Asp Ser Leu Gly Thr Gly Leu Thr Leu Met Ser Leu Gly Phe Phe Val 
275 280 285 

GTC ACA GGT CTT GTT ATT GGG GTT TTT ATA ATC CAC AGA AAC ACT CCA 1873 
Val Thr Gly Leu Val Ile Gly Val Phe Ile Ile His Arg Asn Thr Pro
290 295 300 

ATT GTG AAG GCC AAT AAT AGA TCT CTC AGT TAT ATC CTG CTC ATC ACT 1921 
Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Ile Leu Leu Ile Thr
305 310 315 320 

CTC ACT CTC TGT TTC CTT TGT CCC TTG CTC TTC ATT GGG CTT CCA AAC 1969 
Leu Thr Leu Cys Phe Leu Cys Pro Leu Leu Phe Ile Gly Leu Pro Asn
325 330 335 

ACA GCC ACA TGT ATC CTA CAG CAG AAC TTG TTT GGA CTT CTC TTC ACT 2017 
Thr Ala Thr Cys Ile Leu Gln Gln Asn Leu Phe Gly Leu Leu Phe Thr
340 345 350 

GTG GCT CTA TCC ACA GTG TTG GCC AAA ACT ATC ACT GTA GTT ATG GCA 2065
Val Ala Leu Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala
355 360 365 

TTC AAG ATT ACT GCT CCA GGA AGA AAG ACA AGA TGG TTG CTG ATA TTA 2113 
Phe Lys Ile Thr Ala Pro Gly Arg Lys Thr Arg Trp Leu Leu Ile Leu
370 375 380 



AGA GCC CCT CAG TTC ATC ATT CCA CTT TGT GCC CTG ATG CAA ATC CTT 2161
Arg Ala Pro Gln Phe Ile Ile Pro Leu Cys Ala Leu Met Gln Ile Leu
385 390 395 400 

TTC TCT GGG ATA TGG CTG GGA ACA TCT CCT CCA TTT GTT GAC ATG GAT 2209 
Phe Ser Gly Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Met Asp
405 410 415 

GCT CAC TCT GAA CAT GGG CAC ATC ATC ATT CTA TGC AAC AAG GGC TCA 2257 
Ala His Ser Glu His Gly His Ile Ile Ile Leu Cys Asn Lys Gly Ser
420 425 430 

GCT ATT GGC TTC TAC TGT ACT CTG GCC TAC CTG GGA GTC ATG GCC TTT 2305 
Ala Ile Gly Phe Tyr Cys Thr Leu Ala Tyr Leu Gly Val Met Ala Phe
435 440 445 

GGT AGT TAC CTC TTG GCT TTC ATG TCC AGG AAT CTT CCT GAC ACA TTT 2353 
Gly Ser Tyr Leu Leu Ala Phe Met Ser Arg Asn Leu Pro Asp Thr Phe 
450 455 460 

AAT GAA TCC AAG GCC CTG GCT TTC AGC ATG CTG ATG TTC TGC AGT GTC 2401 
Asn Glu Ser Lys Ala Leu Ala Phe Ser Met Leu Met Phe Cys Ser Val 
465 470 475 480 

TGG GTC ACA TTC CTC CCT GTC TAC CAC AGC ACC ACT GGG AAG GTC AGG 2449 
Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Thr Gly Lys Val Arg 
485 490 495 

GTG GCT ATG GAA ATG TTT TCT ATC TTG GCT TCC AGT GCA AGC ATT CTA 2497 
Val Ala Met Glu Met Phe Ser Ile Leu Ala Ser Ser Ala Ser Ile Leu
500 505 510 

ACC CTA ATC TTT GTC CCT AAG TGC TAC ATT GTT TTG TTC AGA CCA GAG 2545 
Thr Leu Ile Phe Val Pro Lys Cys Tyr Ile Val Leu Phe Arg Pro Glu
515 520 525 

AGG AAC ATA CTT CCT CTA AAC AGA GAA AAA AGA CAG CAT AGG AGT AAA 2593 
Arg Asn Ile Leu Pro Leu Asn Arg Glu Lys Arg Gln His Arg Ser Lys
530 535 540 

AAT TCT GAA ACA TAGCAGTCAA GACAAACATT GGCCTAGCAC AAAATGTCTG ATTGT 2650 

Asn Ser Glu Thr 

545 

TGGCATTTCT CCTGCTATAT AAACAATTAG TCCTTTGACT TTGAGGACAG GATCACATGA 2710 

GACAGACCGG TGATATTGCT TCAAATTATG TAAAATATGT GACATGGTTA TATTGACCAA 2770 

TAAAATACTT GTTCTTGTAT GAAAAAAAAA AAAAAAAAAA A 2811 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 548 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 



Met Leu Glu Leu Ala His Gly Thr Leu Thr Phe Ser Pro His His Gly
1 5 10 15

Glu Ile Ser Asp Phe Thr Asn Phe Met Gln Glu Val Thr Pro Ile Lys
20 25 30

Tyr Pro Glu Asp Ile Phe Leu His Ile Leu Trp Asn Gln Tyr Phe Asn
35 40 45

Cys Pro Leu Leu His Ser Glu Cys Lys Ile Phe Glu Asn Cys Ile Pro
50 55 60

Asn Ala Ser Leu Glu Leu Leu Pro Gly Gly Val Phe Glu Leu Val Met
65 70 75 80

Thr Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His
85 90 95

Ser Leu His Glu Lys Ala Leu His Gln Val Glu Ile Gln Pro Gln Asp
100 105 110

Asn Lys Asp Arg Thr Ile Leu Phe Pro Trp Gln Leu His Pro Phe Leu
115 120 125

Lys Asn Ile Gln Leu Ile Asn Ser Val Gly Asp Arg Val Ile Leu Asp
130 135 140

Trp Lys Lys Lys Thr Asp Thr Glu Tyr Asp Ile Ser Asn Ile Trp Asn
145 150 155 160

Phe Pro Thr Gly Leu Ser Leu Leu Val Lys Val Gly Thr Phe Ala Pro
165 170 175

Ser Ala Pro Lys Gly Glu Gln Leu Ser Ile Ser Glu His Thr Ile Asn
180 185 190

Trp Pro Ile Gly Phe Thr Glu Ile Pro Lys Ser Val Cys Ser Glu Ser
195 200 205

Cys Ser Pro Gly His Arg Lys Val Ile Leu Glu Ser Lys Pro Ala Cys
210 215 220

Cys Phe Asp Cys Thr Pro Cys Pro Asp Lys Glu Ile Ser Asn Glu Thr
225 230 235 240

Asp Val Gly Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr
245 250 255

Glu Lys Ser His Cys Leu Lys Lys Thr Met Thr Phe Leu Asp Tyr Asn
260 265 270

Asp Ser Leu Gly Thr Gly Leu Thr Leu Met Ser Leu Gly Phe Phe Val
275 280 285

Val Thr Gly Leu Val Ile Gly Val Phe Ile Ile His Arg Asn Thr Pro
290 295 300

Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Ile Leu Leu Ile Thr
305 310 315 320

Leu Thr Leu Cys Phe Leu Cys Pro Leu Leu Phe Ile Gly Leu Pro Asn
325 330 335

Thr Ala Thr Cys Ile Leu Gln Gln Asn Leu Phe Gly Leu Leu Phe Thr
340 345 350

Val Ala Leu Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala
355 360 365

Phe Lys Ile Thr Ala Pro Gly Arg Lys Thr Arg Trp Leu Leu Ile Leu
370 375 380

Arg Ala Pro Gln Phe Ile Ile Pro Leu Cys Ala Leu Met Gln Ile Leu
385 390 395 400

Phe Ser Gly Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Met Asp
405 410 415

Ala His Ser Glu His Gly His Ile Ile Ile Leu Cys Asn Lys Gly Ser
420 425 430

Ala Ile Gly Phe Tyr Cys Thr Leu Ala Tyr Leu Gly Val Met Ala Phe
435 440 445

Gly Ser Tyr Leu Leu Ala Phe Met Ser Arg Asn Leu Pro Asp Thr Phe
450 455 460

Asn Glu Ser Lys Ala Leu Ala Phe Ser Met Leu Met Phe Cys Ser Val
465 470 475 480

Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Thr Gly Lys Val Arg
485 490 495

Val Ala Met Glu Met Phe Ser Ile Leu Ala Ser Ser Ala Ser Ile Leu
500 505 510

Thr Leu Ile Phe Val Pro Lys Cys Tyr Ile Val Leu Phe Arg Pro Glu
515 520 525

Arg Asn Ile Leu Pro Leu Asn Arg Glu Lys Arg Gln His Arg Ser Lys
530 535 540

Asn Ser Glu Thr
545

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3584 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 273...2576

(D) OTHER INFORMATION: GoVN2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

CACACTGCCC AGGTTTAAGG CAGAAAGAAT ATGTTCATTT TGATGGTAGT ATTTTTCCTT 60 
CTCCACCATC CACTTCTCAT GGCAAATTTC ATCGATCCCT GGTGCTTTTG GAGAAGAAAT 120 
TTGAATGAAG TCAAGGAAAA AAACTTGGAT ATAAATTGTG CCTTCATCCT TGGAGCAGTT 180 
CAGTTGCCTA TGGAGAAAGA TATTTCAATG AGACTTTGAA TGTCCTAAAA ACAACTAAAA 240 
ACAACAAATA TGCCTTGGCA TTAGCCTTTT CA ATG GAG GAA ATC AAC AGG AAC 293 

Met Glu Glu Ile Asn Arg Asn
1 5 

CCT GAT CTT TTA CCA AAT ATG TCT TTG GTT ATA AAA CAT ACT TTG AGC 341 
Pro Asp Leu Leu Pro Asn Met Ser Leu Val Ile Lys His Thr Leu Ser
10 15 20 

TAT TGT GAT GGA AAT ACT GCA GAC CAT ATA TTT AAA GAA AAA TTT TAT 389 
Tyr Cys Asp Gly Asn Thr Ala Asp His Ile Phe Lys Glu Lys Phe Tyr
25 30 35 

AAG CCT TTA CCT AAT TAT GTC TGT AAT GAA GAG ACT ATG TGT TCA TTT 437 
Lys Pro Leu Pro Asn Tyr Val Cys Asn Glu Glu Thr Met Cys Ser Phe 
40 45 50 55 

ATG CTT ATA GGG CTG AAT TGG GTA TTG TCT CTA ACA CTT TTT AAA GAC 485 
Met Leu Ile Gly Leu Asn Trp Val Leu Ser Leu Thr Leu Phe Lys Asp
60 65 70 

TTG GAC ATC TTC TCA TTT CCA CGT TTC CTT CAA ATT TCC TAT GGA CCT 533 
Leu Asp Ile Phe Ser Phe Pro Arg Phe Leu Gln Ile Ser Tyr Gly Pro
75 80 85 

TTC CAT TCC ATC TTC AGT GAT AAT GAA CAA TTT CCA TAT CTC TAT CAG 581 
Phe His Ser Ile Phe Ser Asp Asn Glu Gln Phe Pro Tyr Leu Tyr Gln
90 95 100 

ATG ACC CCA AAG GAC ACA TCA CTA GCA TTG GCA ATT GTC TCC TTC TTA 629 
Met Thr Pro Lys Asp Thr Ser Leu Ala Leu Ala Ile Val Ser Phe Leu
105 110 115

CTT TAC TTC AAT TGG AAC TGG GTT GGG CTT GTC ATC TCT GAT AAT GAT 677 
Leu Tyr Phe Asn Trp Asn Trp Val Gly Leu Val Ile Ser Asp Asn Asp
120 125 130 135 

GAA GGC AAT CAA TTT CTC TCA GAG TTG AAA AAA GAG ACC CAA AAC AAG 725 
Glu Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Thr Gln Asn Lys
140 145 150 

GAA ATT TGC TTT GCC TTT GTT AAC ATG ATG TCA ATC CAT GAG CAT TCA 773 
Glu Ile Cys Phe Ala Phe Val Asn Met Met Ser Ile His Glu His Ser
155 160 165

TCT TAT CAA AAA ACT GAA ATG TAC TAG AAT CAA ATA GTG ATG TCA TCA 821
Ser Tyr Gln Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser
170 175 180 

ACA AAT ATT ATT ATC ATT TAT GGG AAA ACA AAC AGT ATC ATT GAA TTG 869 
Thr Asn Ile Ile Ile Ile Tyr Gly Lys Thr Asn Ser Ile Ile Glu Leu
185 190 195 

AGC TTC AGA ATG TGG GTA TCT CCA GTT ATA CAG AGG ATT TGG GTC ACA 917 
Ser Phe Arg Met Trp Val Ser Pro Val Ile Gln Arg Ile Trp Val Thr
200 205 210 215 

AAC TCA GAG TTG GAT TTC CCG ACA AGT ATG AGA GAC TTC ACT CAT GGC 965 
Asn Ser Glu Leu Asp Phe Pro Thr Ser Met Arg Asp Phe Thr His Gly 
220 225 230 

ACA TTC TAT GGG ACT CTG ACA TTT CTA CAC CAC CAT GGT GAG ATT TCT 1013 
Thr Phe Tyr Gly Thr Leu Thr Phe Leu His His His Gly Glu Ile Ser
235 240 245 

GGA TTT ACA AAT TTT TTC GAG ACA TGG GAC CAT CTC AGA AGC AGA GAT 1061 
Gly Phe Thr Asn Phe Phe Glu Thr Trp Asp His Leu Arg Ser Arg Asp 
250 255 260 

TTA AAT CTA TTA ATA CCA GAG TGG AAG TAC TTT AGC TAT GAT GCC TCA 1109 
Leu Asn Leu Leu Ile Pro Glu Trp Lys Tyr Phe Ser Tyr Asp Ala Ser
265 270 275 

GGA TCT AAC TGT AAA ATA TTG AGG AAC TAT TCA TCC AAT GCC TCA TTG 1157 
Gly Ser Asn Cys Lys Ile Leu Arg Asn Tyr Ser Ser Asn Ala Ser Leu
280 285 290 295 

GAA TGG ATA ACA GAA CAG AAG TTT CAC ATG GCC TTT AAT GAT TAT AGT 1205 
Glu Trp Ile Thr Glu Gln Lys Phe His Met Ala Phe Asn Asp Tyr Ser
300 305 310 

CAT AGT ATA TAT AAT GCT GTG TAT GCC ATG GCC CAT GCC CTC CAT GAG 1253 
His Ser Ile Tyr Asn Ala Val Tyr Ala Met Ala His Ala Leu His Glu
315 320 325 

ACT AAT CTG CAA GAG GTT GAT AAT AAG GAA ATA AGA AAT GGG AAA GGA 1301 
Thr Asn Leu Gln Glu Val Asp Asn Lys Glu Ile Arg Asn Gly Lys Gly
330 335 340 

GCA AGT ACT CAC TGC TTG AAG GTA AAC TCA TTT CTC AGA AAG ACC CAC 1349
Ala Ser Thr His Cys Leu Lys Val Asn Ser Phe Leu Arg Lys Thr His 
345 350 355 

TTT ACT AAT TCT CAT GGA GAG AGA GTG ATT ATG AAA CAG AGA GTG AGA 1397 
Phe Thr Asn Ser His Gly Glu Arg Val Ile Met Lys Gln Arg Val Arg
360 365 370 375 

GTA CAG GAA GAC TAT GAC ATT GTT CAC ATT CAG AAT TTC TCA CAA CAC 1445 
Val Gln Glu Asp Tyr Asp Ile Val His Ile Gln Asn Phe Ser Gln His
380 385 390 

CTT CGG ATT AAG ATG AAG ATA GGA AAG TTC AGC CCA TAT TTT ACA CAT 1493 
Leu Arg Ile Lys Met Lys Ile Gly Lys Phe Ser Pro Tyr Phe Thr His
395 400 405 

GGT GGA CCC TTT CAC TTA TAT GAA GAC ATG ATT CAG TTG GCC ACA GGA 1541 
Gly Gly Pro Phe His Leu Tyr Glu Asp Met Ile Gln Leu Ala Thr Gly
410 415 420 



AGT AGA AAG ATG CCG TCC TCT GTG TGC AGT GCA GAT TGT AGT CCT GGA 1589
Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly
425 430 435 

TTC AGA AAA TCC TGG AAG GAG GGA ATG GCC CCC TGC TGT TTT ATT TGC 1637 
Phe Arg Lys Ser Trp Lys Glu Gly Met Ala Pro Cys Cys Phe Ile Cys
440 445 450 455 

AGC CTG TGC CCT GAA AAT GAA ATT TCT AAT GAG ACA AAT ATG GAT CAA 1685 
Ser Leu Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln
460 465 470 

TGT GTG AAT TGT CCA GAA TAC CAA TAT GCC AAC ACA GAA AAG AAC AAA 1733 
Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys
475 480 485 

TGC ATT CAG AAA GAC GTG ATT TTT CTA AGC TAT GAA GAC CCC TTG GGA 1781 
Cys Ile Gln Lys Asp Val Ile Phe Leu Ser Tyr Glu Asp Pro Leu Gly
490 495 500 

ATG GCT CTT GCC TTA ATT GCC TTC TGT TTG TCT GCA TTC ACA GCT GTG 1829
Met Ala Leu Ala Leu Ile Ala Phe Cys Leu Ser Ala Phe Thr Ala Val
505 510 515 

GTA CTT TGG GTC TTT GTG AAG CAC CAT GAC ACT CCT ATT GTG AAG GCC 1877 
Val Leu Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala
520 525 530 535 

AAT AAC AGA ATC CTC AGC TAC ATA TTA ATC ATG TCA CTA ATG TTC TGT 1925 
Asn Asn Arg Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe Cys
540 545 550 

TTT CTC TGC TCC TTT TTC TTC ATT GGC CAT CCT AAC AGA GGT ACC TGT 1973 
Phe Leu Cys Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr Cys
555 560 565 

ATC TTA CAG CAA ATC ACA TTT GGC ATT GTA TTC ACT GTG GCT GTT TCC 2021 
Ile Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser
570 575 580 

ACA GTT CTG GCC AAA ACA ATC ACT GTC ATT CTT GCT TTC AAA CTC AGA 2069 
Thr Val Leu Ala Lys Thr Ile Thr Val Ile Leu Ala Phe Lys Leu Arg
585 590 595 

GAC CCA GGG AGA AGT TTA AGA AAC TTC CTG GTA TCT GGT GCA CCC AAC 2117 
Asp Pro Gly Arg Ser Leu Arg Asn Phe Leu Val Ser Gly Ala Pro Asn 
600 605 610 615 

TAC ATT ATT CCT ATA TGT TCC TTA TTG CAA TGT ATT CTG TGT GCA ATT 2165 
Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile
620 625 630 

TGG CTA GCA GTT TCT CCT CCT TTT GTT GAT ATT GAT GAA CAT TCT GAG 2213 
Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu
635 640 645 

CAT GGC CAC ATC ATG ATT GTG TGC AAC AAG GGC TCC ATT ATG GCA TTC 2261 
His Gly His Ile Met Ile Val Cys Asn Lys Gly Ser Ile Met Ala Phe
650 655 660 

TAC TGT GTC CTA GGA TAC TTG GCC TGC CTG GCG CTT GGA AGC TTC ACT 2309 
Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr 
665 670 675 

ACA GCT TTC TTG GCA AAG AAT CTG CCA GAC ACA TTC AAC GAA GCC AAG 2357 
Thr Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys
680 685 690 695

TTC TTG ACC TTC AGC ATG CTA GTG TTC TGC AGT GTC TGG GTC ACC TTT 2405 
Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe 
700 705 710

CTC CCT GTG TAC CAT AGC ACA AGG GGC AGG GTC ATG GTT GCT GTT GAG 2453 
Leu Pro Val Tyr His Ser Thr Arg Gly Arg Val Met Val Ala Val Glu 
715 720 725 

ATC TTC TCT ATC TTG GCA TCC AGT GCA GGG ATG TTT GGA TGC ATC TTT 2501 
Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Phe Gly Cys Ile Phe
730 735 740 

GCA CCC AAA ATC TAC ATC ATA TTA ATG AAA CCA GAA AGA AAT TCT ATA 2549 
Ala Pro Lys Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Asn Ser Ile
745 750 755 

CAA AAG TTC AGG GAG AAA TCA TAT TTC TAAACAAATA TTTCAGGAAT TTAGTTG 2603 
Gln Lys Phe Arg Glu Lys Ser Tyr Phe
760 765 

AATATTAAGT TGGTATATAC CCACCAAATA TTTGGTTATT GTGCATGTAT AGAGTTTTAG 2663 

AATCAGTCTT ACTGATTCCT CTATTGCTGT CTAGAGGTAT CTTATCTACC AGTCTTGCAT 2723 

ACATTGTCCA TAAAATCTTG TACTCATTCA CTTCTTTAGT TTCCTCTGAG AAAACTAAAT 2783 

TTCTCAAATT ATTACTAAAA TGTAATTCAA CATTATGCTT TCATGGATAT TTCCCCCTGG 2843

TTACATCAGA TAAATTTGAT AAGACAGCTG ATTTTGTTAC CTTATATAGA AGGTATATGA 2903 

ATGTCCTGCC TTACAGGACA GAGAGGAATT ACACTTAGAA ACCGTCTATC AAGTCAAACA 2963 

TTCAATCATA CTGAAAAATA AACTAAAGGA TCAACAGAGA TAAAAAGCAG AATACATTTT 3023 

CTGTTTTCTA GTCGGAGCAT ATACATGACA GAATTCTGTT TTTATTTACA GTTGCTCTTC 3083 

AAGGTTTTGG TCAATAGTCT AAGATGCAAA TGTTTTCTTT TTTTCTGATC TCAAAAAAAA 3143 

TATTATAGCC AACAATTGAA AGAAGCCAGT GACCACTGTG TTTAAATTAG GAACTAGTTT 3203 

GAGGATCCTG AGAAGGAGGG TGACTCATTG GAAGACCAGC AGTCTTATCT AACCTGAATA 3263 

ACAAAGAATT TTCAGACACT GAGCCTCTAA CCGGGCAGCA TACACCAGTT GATATGAAGC 3323 

CCCCAACATA TATGCAACAT AGGATGTCCT GGTCTGGCCT TGGTGAGAGA AGACACACCT 3383 

AACCCCCAAG AGACATGATG CTCAAGGGAT TGGGAAGGTG TGGGAGTTGG GAAGGTGGGG 3443 

ACTACTTCTT GATGCTGGGA AAGGAGATAT GGGGTGAGGA AGTGTCAGTG CTCAGACTGG 3503 

GAAAGGGATA ATGAGTTCAC AGTAAAAAAA ATGTTAAAGA ATAAAAATCT AAAACAAAAT 3563 

TAAAAAAAAA AAAAAAAAAA A 3584 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 768 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 



Met Glu Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu
1 5 10 15

Val Ile Lys His Thr Leu Ser Tyr Cys Asp Gly Asn Thr Ala Asp His
20 25 30

Ile Phe Lys Glu Lys Phe Tyr Lys Pro Leu Pro Asn Tyr Val Cys Asn
35 40 45

Glu Glu Thr Met Cys Ser Phe Met Leu Ile Gly Leu Asn Trp Val Leu
50 55 60

Ser Leu Thr Leu Phe Lys Asp Leu Asp Ile Phe Ser Phe Pro Arg Phe
65 70 75 80

Leu Gln Ile Ser Tyr Gly Pro Phe His Ser Ile Phe Ser Asp Asn Glu
85 90 95 
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Gin 


Phe 


Pro 


Tyr 


Leu 


Tyr 


Gin 


Met 








100 










Leu 


Ala 


He 


Val 


Ser 


Phe 


Leu 


Leu 






115 










120 


Leu 


Val 


He 


Ser 


Asp 


Asn 


Asp 


Glu 




130 










135 




Lvs 


Lvs 


Glu 


Thr 


Gin 


Asn 


Lvs 


Glu 


145 










150 






Met 


Ser 


He 


His 


Glu 


His 


Ser 


Ser 










165 








Asn 


Gin 


He 


Val 


Met 


Ser 


Ser 


Thr 








180 










Thr 


Asn 


Ser 


He 


He 


Glu 


Leu 


Ser 






195 










200 


lie 


Gin 


Arcr 


He 




Val 


Thr 


Asn 




210 














Met 




Asp 


Phe 


Thr 


His 


Glv 


X XXX. 


225 










« o w 






His 


His 


His 


Glv 


Glu 


He 


Ser 


Glv 










245 








Asp 


His 


Leu 


Arcr 


Ser 


Arcr 


Asp 


Leu 








260 










Tyr 


Phe 


Ser 


Tvr 


Asp 


Ala 


Ser 


Glv 






275 










280 


Tvr 


Ser 


Ser 


Asn 


Ala 


Ser 


Leu 


Glu 




290 










295 




Met 


Ala 


Phe 


Asn 


Asp 


Tvr 


Ser 


His 


305 










310 






Met 


Ala 


His 


Ala 


Leu 


His 


Glu 


Thr 










325 








Glu 


He 


Arcr 


Asn 


Gly 


Lvs 


Gly 


Ala 








340 










Ser 


Phe 


Leu 


Arcr 


LVS 


Thr 


His 


Phe 






355 










360 


lie 


Met 


Lys 


Gin 


Arcr 


Val 


Arcr 


Val 




370 










375 




lie 


Gin 


Asn 


Phe 


Ser 


Gin 


His 


Leu 


385 










390 






Phe 


Ser 


Pro 


Tvr 

* Jr * 


Phe 


Thr 


His 


Glv 










405 








Met 


He 


Gin 


Leu 


Ala 


Thr 


Glv 


Ser 








420 










Ser 


Ala 


Aso 


Cvs 


Ser 


Pro 


Glv 


Phe 






435 












Ala 


Pro 


Cys 




Phe 


He 


Cys 






450 










455 




Asn 


Glu 


Thr 


Asn 


Met 


Asp 


Gin 


Cvs 


465 










470 






Ala 


Asn 


Thr 


Glu 


Lys 


Asn 


Lys 


Cvs 










485 








Ser 


Tyr 


Glu 


Asp 


Pro 


Leu 


Glv 


Met 








500 










Leu 


Ser 


Ala 


Phe 


Thr 


Ala 


Val 


Val 






515 










520 


Asp 


Thr 


Pro 


He 


Val 


Lys 


Ala 


Asn 




530 










535 




He 


Met 


Ser 


Leu 


Met 


Phe 


Cys 


Phe 


545 










550 




His 


Pro 


Asn 


Arg 


Gly 


Thr 


Cys 


He 










565 








Val 


Phe 


Thr 


Val 


Ala 


Val 


Ser 


Thr 








580 










He 


Leu 


Ala 


Phe 


Lys 


Leu 


Arg 


Asp 






595 










600 


Leu 


Val 


Ser 


Gly 


Ala 


Pro 


Asn 


Tyr 
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Thr 


Pro 


Lvs 


Asd 

****** 


Thr 


Ser 


Leu 


Ala 


105 










110 






Tvr 


Phe 


Asn 


Trr> 


Asn 


Trn 


Val 


Glv 
uxy 


















Glv 


Asn 


Gin 


Phe 


Leu 


Ser 


Glu 


Leu 








140 










He 


Ova 


Phe 


Ala 


Phe 


Val 


As 


Mat- 






i ^ r 

Xj J 










1 <tn 
xou 


Tvr 


Gin 


Lys 


Thr 


Glu 


Met 


Tvr 

xyxr 


Tvr 
xyx 




170 










17c 
x / 3 




Asn 


He 


He 


He 


He 


Tvr 


Glv 


Lys 


1 QC 










1 on 






Phe 




Met 


x irp 


Val 


Ser 


Pro 


Val 


















Ser 


Glu 


Xjeu 


ASp 


Phe 


pro 


TVir 
X XXX 


Ser 




















i-yr 


m v 

uxy 


T"hr 
X XIX. 


Lieu 


X XXX 


Jrfie 


Leu 
















^4 U 


Phe 


Thr 


Asn 


Xt XXC 


XT XXC 


17J.U 


X XXX 


Trp 




*? n 










9 5 




Asn 


Leu 


Leu 


He 


Pro 


VJX. u 


Trp 


Lys 


265 










^ / u 






Ser 


Asn 


Cys 


Lys 


He 


Leu 


Arcr 


Asn 










0 ft r 








Trio 


He 


Thr 


Glu 


Gin 


Lys 


Phe 


His 








300 










Ser 


He 


Tvr 


Asn 


Ala 


Val 


Tvr 


Ala 






315 










320 


Asn 


Leu 


Gin 


Glu 


Val 


Asp 


Asn 


Lys 


















Ser 


Thr 


His 


Cvs 


Leu 


Lys 


Val 


Asn 


345 










350 






Thr 


Asn 


Ser 


His 


Gly 


Glu 




Val 










3 £ 5 
J D 3 








Gin 


Glu 


Asp 


Tvr 
xyx 


Asp 


He 


Val 


His 


















Arcr 


He 


Lys 


Met 


Lys 


He 


Gly 


Lys 
















Ann 


Glv 


Pro 


Phe 


His 


eu 


Tyr 


Glu 


Asp 




410 










A1 5 




Arcr 


Lys 


Met 


Pro 


Ser 


Ser 


Val 


Cys 


425 










43 0 






Arcr 


Lys 


Ser 




Lys 


Glu 


Glv 


Met 










AA 5 








Leu 


Cvs 


Pro 


Glu 


Asn 


Glu 


He 


Ser 








YOU 










Val 


Asn 


Cys 


Pro 


Glu 


Tvr 

Ay XT 


Gin 


Tvr 
xyx 






A 7R 










Aft n 


He 


Gin 


Lys 


Asp 


Val 


He 


Phe 


Leu 




490 










40c 




Ala 


Leu 


Ala 


Leu 


He 


Ala 


Phe 




505 














Leu 


Trp 


Val 


Phe 


Val 


Lvs 


His 


His 










525 








Asn 


Arg 


He 


Leu 


Ser 


Tyr 


He 


Leu 








540 










Leu 


Cys 


Ser 


Phe 


Phe 


Phe 


He 


Gly 






555 










560 


Leu 


Gin 


Gin 


He 


Thr 


Phe 


Gly 


He 




570 










575 




Val 


Leu 


Ala 


Lys 


Thr 


He 


Thr 


Val 


565 










590 






Pro 


Gly 


Arg 


Ser 


Leu 


Arg 


Asn 


Phe 










605 








He 


He 


Pro 


He 


Cys 


Ser 


Leu 


Leu
610 615 620





aXC UCU 




Ti - 

lie 


irp 


Leu 


n 1 a 
nla 


Val 


Ser 


fro 


Pro 


Fne 


vax 


625 




630 










635 










CA ft 


Asp He 




Ser 


UlU 


W*i a 
nio 


«xy 


1115 


He 


Met 




Val 


cys 


Asn 




645 




















655 




Lys Gly 


Ser 1 Tip Mpt 

w jl new 


Ala 


Phe 


Tvr 


cys 


Vol 


Leu Gly 


Tyr 


Leu 


Ala 


Cys 




660 








665 










670 




Leu Ala 


Leu Gly Ser 


Phe 


Thr 


Thr 


Ala 


Phe 


Leu 


Ala 


Lys 


Asn 


Leu 


Pro 




675 






680 










685 








Asp Thr 


Phe Asn Glu 


Ala 


Lys 


Phe 


Leu 


Thr 


Phe 


Ser 


Met 


Leu 


Val 


Phe 


690 






695 










700 










Cys Ser 


Val Trp Val 


Thr 


Phe 


Leu 


Pro 


Val 


Tyr 


His 


Ser 


Thr 


Arg 


Gly 


705 




710 










715 








720 


Arg Val 


Met Val Ala 


Val 


Glu 


He 


Phe 


Ser 


He 


Leu 


Ala 


Ser 


Ser 


Ala 




725 










730 










735 




Gly Met 


Phe Gly Cys 


lie 


Phe 


Ala 


Pro 


Lys 


He 


Tyr 


He 


He 


Leu 


Met 




740 








745 










750 






Lys Pro 


Glu Arg Asn 


Ser 


He 


Gin 


Lys 


Phe 


Arg 


Glu 


Lys 


Ser 


Tyr 


Phe 




755 






760 










765 







(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3578 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1181...3181
(D) OTHER INFORMATION: GoVN3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CTATCTTGAA GAGTGCTTTT CTGTGTAACT TGCTTTGCTG CACGTTTACA AATTATTTTT 60 

TCTTGGTGAA ATTACTAAGA TGTTCTCTTT TCTGTTTGCA ATTCTTGTCC TGAAGCTTTC 120 

TTTTCCTTTG TGCAGTCCAA TTGACAACCG TTGTTTTTGG AGATTAAAAA CCAAGACATT 180 

TTGGGAAGGA GACAAAGAAC TTGATTGCTT TTTTTTTATT TATACAAGGT TTGGTCATGT 240 

AAAGAATGAA CAGTTCAGTG GGAATCTAGA CAAGCGGTTG ACATCTAAGA CTATCCACTT 300 

GATTTTGACT CTTTATTTTG CCCTTGAAGA AATAAACAGG AACCCCCATA TTCTACCTAA 360

CATTTCACTG CTAGTTAAAA TTGAATGTGG GCTGCTAGAT GATTGGACAA TAAACAGTTT 420 

ATCTTCTAAA AGAGAAAAAT ATCTTCCTAA CTACTACTGT ATAAATCAGA GAAGATATTT 480

AATTGTACTT ACAGGACCAA TGTGGTTAGC ATCTGTCATA GTTGGGCCAC TCCTATACAT 540

AACTAAGAGG CCAGAGATGG ATCAACTCAA CTCTTCTGGC TCAAATTCTT CCCTAAAGTC 600 

ACTAATTGGA TATGGCTTTA CTCAGCTTCT CATTGATTTG CTTTGCTTGA ACAATCACTG 660 

CCCATTTGTT TTAGTCTTCT GTCTCCTTTA TATTCTGGCT ACAACTGCCT CTACTGATGC 720 

ACATTGAACT GCATGAACTC ACAAATTAAC TCAACACCAT TGCACTGCAT TCTTTGCACT 780 

GAGTCTCAAA AGTCTGGTTT AACTCTTCTG CATTGAACTC AACTGACTAA TTAGAACTCA 840 

GAAATCTGCA TCCCTCTGTC TCCTGAGTAC TTTGATTAAA GGTGTGTACT ATCACACCTG 900 

CACCTAAACT TTTCTATACT AAAAATTTGC TTTATACTAG GCTGACCTTG AACTAAGTGA 960 

TCTGCTTGCC TCTGTCTCCT GCCTTCCAAG GAATGCCTAT TTCCCAGCAG GATATTTTTT 1020 

GCCTACAAGT CTTCAGATGT GATCCATTAA GTATAGTCAT GTTGCTGGAT TAAAATTCCT 1080 

CTACAGATTT AATTTTCTGA TCCTGAGGCT AGTGAAACTT TACTATGGGC CATTTCACCC 1140 

TCTCTTGAGC AACCAAGAAC TGTATCCATA TCTTTACCAA ATG GCT CCT AAG GAC 1195 

Met Ala Pro Lys Asp 
1 5 

ACA TCT CTG GCA CTG GCC ATG GTT TCT TTG TTT GTC CAT TTC AGC TGG 1243 
Thr Ser Leu Ala Leu Ala Met Val Ser Leu Phe Val His Phe Ser Trp 
10 15 20

AAC TGG GTA GGA GCT GTT GTT TCA GAT GAT GAC CCA GGT TAT GAA TTT 1291
Asn Trp Val Gly Ala Val Val Ser Asp Asp Asp Pro Gly Tyr Glu Phe 
25 30 35 

ATC TTG GAA TTG AGA AGA GAA ATG CAA AGG AAC AAT TTT TGT TTA GCA 1339 
Ile Leu Glu Leu Arg Arg Glu Met Gln Arg Asn Asn Phe Cys Leu Ala
40 45 50 

TTT GTG AGT ATC ATT GTT AGT GAT GAC AAT TTA TTT CTG AAA AGG TAT 1387 
Phe Val Ser Ile Ile Val Ser Asp Asp Asn Leu Phe Leu Lys Arg Tyr
55 60 65 

AAT ATC TAT TAC AAC CAG ATC AAG ATG TCA TCA GCA AAA GTT GTT ATC 1435
Asn Ile Tyr Tyr Asn Gln Ile Lys Met Ser Ser Ala Lys Val Val Ile
70 75 80 85 

ATT TAT GGA GAC AAA GAC TCT CCT CTA CAG GTG AAC TTT AGA CTA TGG 1483 
Ile Tyr Gly Asp Lys Asp Ser Pro Leu Gln Val Asn Phe Arg Leu Trp
90 95 100 

AAT TTA TTT GAT ATC CAA AGA ATC TGG GTC ACT ACT TCA CAG TGG GAT 1531 
Asn Leu Phe Asp Ile Gln Arg Ile Trp Val Thr Thr Ser Gln Trp Asp
105 110 115

ATG ATC ATA AAT AAT GGA AAA TTC CTC CTT AAT TCC TTC TAT GGG ACT 1579 
Met Ile Ile Asn Asn Gly Lys Phe Leu Leu Asn Ser Phe Tyr Gly Thr
120 125 130 

CTC AGT TTT TCA CAT CAC TAT TCT GAA TTA TCT GGT TTT AAA ACA TTT 1627 
Leu Ser Phe Ser His His Tyr Ser Glu Leu Ser Gly Phe Lys Thr Phe 
135 140 145 

ATC CAG ACA GCA TAC CCT TCA AAC TAC AGT GAT GAC TTT TCT CTT GGT 1675 
Ile Gln Thr Ala Tyr Pro Ser Asn Tyr Ser Asp Asp Phe Ser Leu Gly
150 155 160 165 

ATA TTA TGG TGG GTG TAT TTT AAT TGT TCT TTG TCA TTA TCT GAA TGT 1723 
Ile Leu Trp Trp Val Tyr Phe Asn Cys Ser Leu Ser Leu Ser Glu Cys
170 175 180 

AAG AAT CTG CAA AAT TGT CCA AAG GAA AAC ATA TTT AGA TGG TTA TAC 1771 
Lys Asn Leu Gln Asn Cys Pro Lys Glu Asn Ile Phe Arg Trp Leu Tyr
185 190 195 

AGG CAC CAT TTT GAA ATG TCT TTG AGT GAT ACT ACT TAT GAC CTA TAT 1819 
Arg His His Phe Glu Met Ser Leu Ser Asp Thr Thr Tyr Asp Leu Tyr 
200 205 210 

AAT TCT ATG TAT GCT GTG GCT TAC ACA CTC CAA CAG ATG CTT CTG AAA 1867 
Asn Ser Met Tyr Ala Val Ala Tyr Thr Leu Gln Gln Met Leu Leu Lys
215 220 225 

CAA GCA GAT ACA TGG CAA ATA GAT GAT GGA AAA GAA CCA GAA TTT GAC 1915 
Gln Ala Asp Thr Trp Gln Ile Asp Asp Gly Lys Glu Pro Glu Phe Asp
230 235 240 245 

TCT TGG CAG ATG CTC TCT TTC CTG AGA AAT ATC CAA TTT ATA AAC CCT 1963 
Ser Trp Gln Met Leu Ser Phe Leu Arg Asn Ile Gln Phe Ile Asn Pro
250 255 260 

GTT GGT GAC AAA GTG AAC CTG AAT CAT GAA GAA AAA CTG GAT ACA AAG 2011 
Val Gly Asp Lys Val Asn Leu Asn His Glu Glu Lys Leu Asp Thr Lys 
265 270 275 



TAT GAG ATT CAC CAG ACT TTG ACT TTT TTG CCA AAT CCT GTA TTT AAG 2059
Tyr Glu Ile His Gln Thr Leu Thr Phe Leu Pro Asn Pro Val Phe Lys
280 285 290 

CTG AAA ATA GGA ACA TTT TCC CAA AAC TTA TCA CAT GGT CGA CAA TTA 2107 
Leu Lys Ile Gly Thr Phe Ser Gln Asn Leu Ser His Gly Arg Gln Leu
295 300 305 

TAT ATG TTG AAA GAA ATG ATA GAG TGG AAC ACA GGC CAC CAA CAG TCT 2155 
Tyr Met Leu Lys Glu Met Ile Glu Trp Asn Thr Gly His Gln Gln Ser
310 315 320 325 

CCA ACC TCA GTT TGC AGT ATT CCT TGT AGT CCA GGA TTC AGA AAA TCC 2203 
Pro Thr Ser Val Cys Ser Ile Pro Cys Ser Pro Gly Phe Arg Lys Ser
330 335 340 

CCT CAG CTG GGA AAG CCT GTT TGC TGT TTT GAT TGT ACA CCC TGC CCA 2251 
Pro Gln Leu Gly Lys Pro Val Cys Cys Phe Asp Cys Thr Pro Cys Pro
345 350 355 

GAA AAT GAA ATT TCC AAC ATG ACA AAC ATG AAT CAA TGT ATC AAG TGT 2299 
Glu Asn Glu Ile Ser Asn Met Thr Asn Met Asn Gln Cys Ile Lys Cys
360 365 370 

CTA AAT GAT CAG TAT GCC AAT CCT GGA GGA ACT CGC TGC CTC AAA AAA 2347 
Leu Asn Asp Gln Tyr Ala Asn Pro Gly Gly Thr Arg Cys Leu Lys Lys
375 380 385 

GTT ATT GTA TTC CTG GGT TAT GAA GAT CCA TTG GGA ATG TCT CTG GCT 2395 
Val Ile Val Phe Leu Gly Tyr Glu Asp Pro Leu Gly Met Ser Leu Ala
390 395 400 405 

ATC TTG GCT CTG TGC TTC TCT GCT CTC ACA GCT TTT GTA CTT AGT ATC 2443 
Ile Leu Ala Leu Cys Phe Ser Ala Leu Thr Ala Phe Val Leu Ser Ile
410 415 420 

TTT TTG AAG CAC CAA GAA ACA CCC ACT GTC AAG GCC AAT AAT AGA ACT 2491 
Phe Leu Lys His Gln Glu Thr Pro Thr Val Lys Ala Asn Asn Arg Thr
425 430 435 

CTC AGC TAT GTT CTA CTC ATC TCC CTC ATC TCT TGT TTT CTC TGC TCC 2539 
Leu Ser Tyr Val Leu Leu Ile Ser Leu Ile Ser Cys Phe Leu Cys Ser
440 445 450 

TTG CTC TTC ATT GGT CAT CCC AGC TTT ACC ACA TGT ATC ATG CAG CAG 2587 
Leu Leu Phe Ile Gly His Pro Ser Phe Thr Thr Cys Ile Met Gln Gln
455 460 465 

ACC ACA TTT GCT GTT GTG TTC ACT GTA GCT GCA TCT ACT GTC TTG GCC 2635 
Thr Thr Phe Ala Val Val Phe Thr Val Ala Ala Ser Thr Val Leu Ala 
470 475 480 485 

AAA ACA ATT ATT GTA ATA TTG GCC TTC AAG GTT ACT AAT ACA AGT AGA 2683 
Lys Thr Ile Ile Val Ile Leu Ala Phe Lys Val Thr Asn Thr Ser Arg
490 495 500 

AAA ATG AGG TGG CTG CTG GTA TCA GGG GCA CCT AAA TTC ATC ATT CCA 2731 
Lys Met Arg Trp Leu Leu Val Ser Gly Ala Pro Lys Phe Ile Ile Pro
505 510 515 

ATT TGC ACA ATG ATT CAA CTG ATT CTC TGT GGA ATT TGG CTG GGT ACT 2779 
Ile Cys Thr Met Ile Gln Leu Ile Leu Cys Gly Ile Trp Leu Gly Thr
520 525 530 



TCT CCT CCA TTT GTT GAT GCT GAT GGA CAT GTT GAA AAA GGC CAC ATT 2827
Ser Pro Pro Phe Val Asp Ala Asp Gly His Val Glu Lys Gly His Ile
535 540 545

TTG ATT TTC TGT AAC AAA GGT TCA ATT CTT GCT TTC TAT TGT GTC CTG 2875 
Leu Ile Phe Cys Asn Lys Gly Ser Ile Leu Ala Phe Tyr Cys Val Leu
550 555 560 565 

GGA TAC TTA GTC TCC ATT GCC ATT GCA AGT TTC ACC CTT GCA TTC TTC 2923 
Gly Tyr Leu Val Ser Ile Ala Ile Ala Ser Phe Thr Leu Ala Phe Phe
570 575 580 

GCC AGA AAT CTG CCC GAC ACA TTC AAT GAA GCC AAG TTC CTA ACA TTC 2971 
Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe 
585 590 595 

AGT ATG CTA GTA TTT TGC AGT GTC TGG GTC ACC TTT CTT CCT GTC TAT 3019 
Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr 
600 605 610 

CAT AGC ACC AAG GGC AAG TCT ATG GTG GCT GTG GAA GTT TTC TGT ATA 3067 
His Ser Thr Lys Gly Lys Ser Met Val Ala Val Glu Val Phe Cys Ile
615 620 625 

TTG GCC TCT AGT GCA GGG CTG CTT TTT TGC ATC TTT GCA CCA AAG TGC 3115 
Leu Ala Ser Ser Ala Gly Leu Leu Phe Cys Ile Phe Ala Pro Lys Cys
630 635 640 645 

TTC ATT ATT TTG TTA AGA CCT GAG AAA AAA TCT TTT CAG AAG TTT CAG 3163 
Phe Ile Ile Leu Leu Arg Pro Glu Lys Lys Ser Phe Gln Lys Phe Gln
650 655 660 

AAT ATA CAT TCT AAA ATT TAAAACATTC ATTAAATTTT TCTGACACAC TTGCTAGA 3219
Asn Ile His Ser Lys Ile
665 



CCAAACTTAT TCAGAAGACT CCACTGACAC TACTAGTTGA AATCAAATTT TAGATCCAAA 3279 

CATGGAATTT GTTCCCAATA AAGAAAGGAA GCACTATGTA TTAGAATTTA AAAACACGTC 3339 

TTAAATCTTG GTTCTCATAA ATCAAACTGT ATGATCAGTC ATTTCAATAA CTGTTTGCTG 3399 

TATTTCTTAA TTTTATGCTT ATACTTGAAG AATGTAAAGA CTGGGAATTG GTTCTGAGTT 3459 

TTATGAATTA ATTTCTAATT TTACTTTCCT TGGAAAAAAT GTCTAGTGTG TGTTGTTGTG 3519 

CTCTATAATA AATAATTATG AGATAAATGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 3578 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 667 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Leu Phe
1 5 10 15

Val His Phe Ser Trp Asn Trp Val Gly Ala Val Val Ser Asp Asp Asp
20 25 30

Pro Gly Tyr Glu Phe Ile Leu Glu Leu Arg Arg Glu Met Gln Arg Asn
35 40 45

Asn Phe Cys Leu Ala Phe Val Ser Ile Ile Val Ser Asp Asp Asn Leu
50 55 60

Phe Leu Lys Arg Tyr Asn Ile Tyr Tyr Asn Gln Ile Lys Met Ser Ser
65 70 75 80

Ala Lys Val Val Ile Ile Tyr Gly Asp Lys Asp Ser Pro Leu Gln Val
85 90 95

Asn Phe Arg Leu Trp Asn Leu Phe Asp Ile Gln Arg Ile Trp Val Thr
100 105 110

Thr Ser Gln Trp Asp Met Ile Ile Asn Asn Gly Lys Phe Leu Leu Asn
115 120 125

Ser Phe Tyr Gly Thr Leu Ser Phe Ser His His Tyr Ser Glu Leu Ser
130 135 140

Gly Phe Lys Thr Phe Ile Gln Thr Ala Tyr Pro Ser Asn Tyr Ser Asp
145 150 155 160

Asp Phe Ser Leu Gly Ile Leu Trp Trp Val Tyr Phe Asn Cys Ser Leu
165 170 175

Ser Leu Ser Glu Cys Lys Asn Leu Gln Asn Cys Pro Lys Glu Asn Ile
180 185 190

Phe Arg Trp Leu Tyr Arg His His Phe Glu Met Ser Leu Ser Asp Thr
195 200 205

Thr Tyr Asp Leu Tyr Asn Ser Met Tyr Ala Val Ala Tyr Thr Leu Gln
210 215 220

Gln Met Leu Leu Lys Gln Ala Asp Thr Trp Gln Ile Asp Asp Gly Lys
225 230 235 240

Glu Pro Glu Phe Asp Ser Trp Gln Met Leu Ser Phe Leu Arg Asn Ile
245 250 255

Gln Phe Ile Asn Pro Val Gly Asp Lys Val Asn Leu Asn His Glu Glu
260 265 270

Lys Leu Asp Thr Lys Tyr Glu Ile His Gln Thr Leu Thr Phe Leu Pro
275 280 285

Asn Pro Val Phe Lys Leu Lys Ile Gly Thr Phe Ser Gln Asn Leu Ser
290 295 300

His Gly Arg Gln Leu Tyr Met Leu Lys Glu Met Ile Glu Trp Asn Thr
305 310 315 320

Gly His Gln Gln Ser Pro Thr Ser Val Cys Ser Ile Pro Cys Ser Pro
325 330 335

Gly Phe Arg Lys Ser Pro Gln Leu Gly Lys Pro Val Cys Cys Phe Asp
340 345 350

Cys Thr Pro Cys Pro Glu Asn Glu Ile Ser Asn Met Thr Asn Met Asn
355 360 365

Gln Cys Ile Lys Cys Leu Asn Asp Gln Tyr Ala Asn Pro Gly Gly Thr
370 375 380

Arg Cys Leu Lys Lys Val Ile Val Phe Leu Gly Tyr Glu Asp Pro Leu
385 390 395 400

Gly Met Ser Leu Ala Ile Leu Ala Leu Cys Phe Ser Ala Leu Thr Ala
405 410 415

Phe Val Leu Ser Ile Phe Leu Lys His Gln Glu Thr Pro Thr Val Lys
420 425 430

Ala Asn Asn Arg Thr Leu Ser Tyr Val Leu Leu Ile Ser Leu Ile Ser
435 440 445

Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His Pro Ser Phe Thr Thr
450 455 460

Cys Ile Met Gln Gln Thr Thr Phe Ala Val Val Phe Thr Val Ala Ala
465 470 475 480

Ser Thr Val Leu Ala Lys Thr Ile Ile Val Ile Leu Ala Phe Lys Val
485 490 495

Thr Asn Thr Ser Arg Lys Met Arg Trp Leu Leu Val Ser Gly Ala Pro
500 505 510

Lys Phe Ile Ile Pro Ile Cys Thr Met Ile Gln Leu Ile Leu Cys Gly
515 520 525

Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Ala Asp Gly His Val
530 535 540

Glu Lys Gly His Ile Leu Ile Phe Cys Asn Lys Gly Ser Ile Leu Ala
545 550 555 560

Phe Tyr Cys Val Leu Gly Tyr Leu Val Ser Ile Ala Ile Ala Ser Phe
565 570 575

Thr Leu Ala Phe Phe Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala
580 585 590

Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr
595 600 605

Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Ser Met Val Ala Val
610 615 620

Glu Val Phe Cys Ile Leu Ala Ser Ser Ala Gly Leu Leu Phe Cys Ile
625 630 635 640

Phe Ala Pro Lys Cys Phe Ile Ile Leu Leu Arg Pro Glu Lys Lys Ser
645 650 655

Phe Gln Lys Phe Gln Asn Ile His Ser Lys Ile
660 665



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4467 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 126...2723

(D) OTHER INFORMATION: GoVN4



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CAGGGATGAG GAAACACCTG TAGAAAAGGA AACCTGAATA CAGGTATAGC ATCTTCTTGG 60 
CCAGTGTAGA AGATGGGGAT AATTGCTACC TGTTTGCTGA TCTGTGCAGC AATTAACTAC 120
CAATA ATG TCC AGG CTC AGA GCA GGA AAA AAT ATG CTC ACC TTC ATT TTA 170 
Met Ser Arg Leu Arg Ala Gly Lys Asn Met Leu Thr Phe Ile Leu
1 5 10 15

CTC TTC TTT CTC CTG AAC ATT CCA CTT TTT GTG CCT AGT TTT ATT TAT 218 
Leu Phe Phe Leu Leu Asn Ile Pro Leu Phe Val Pro Ser Phe Ile Tyr
20 25 30 

CCC AGG TGC TTT TGG AGT ATG AAG AAG AAT GAA TAT CAG GAT AGA AAC 266 
Pro Arg Cys Phe Trp Ser Met Lys Lys Asn Glu Tyr Gln Asp Arg Asn
35 40 45 

CTG GGA ACA GGT TGT ATG TTC TTT ATT CTA GCA GTG CAA CAG CCT ATG 314 
Leu Gly Thr Gly Cys Met Phe Phe Ile Leu Ala Val Gln Gln Pro Met
50 55 60 

GAA AAA GAG TAT TTC AGT CAT ATT TCG AAT ATA CAA ACA CCT ACT GAA 362 
Glu Lys Glu Tyr Phe Ser His Ile Ser Asn Ile Gln Thr Pro Thr Glu
65 70 75 

AAC CAA AAG TAT CCT CTC ACC TTG GCT TTT TCC ATG AAT GAA ATC AAC 410 
Asn Gln Lys Tyr Pro Leu Thr Leu Ala Phe Ser Met Asn Glu Ile Asn
80 85 90 95 

AAC AAC CCT GAT CTT TTG CCA AAT ATG TCT TTA GCA TTT ACA TTC TCA 458 
Asn Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ala Phe Thr Phe Ser 
100 105 110 

GAA TAT AGT TGT TAT TTG GAA TCC CAC CAC AAA AGA TTA TTT AAT TTT 506 
Glu Tyr Ser Cys Tyr Leu Glu Ser His His Lys Arg Leu Phe Asn Phe 
115 120 125 

TCT TTA AAA AAT CAT GAA ATT CTC CCT AAT TTT ATC TGT ACA AAA GAC 554 
Ser Leu Lys Asn His Glu Ile Leu Pro Asn Phe Ile Cys Thr Lys Asp
130 135 140

ATC AAG TGT GGA GTG GTA CTT ACC GGA CTT AGT TTG GTA ACA ACT GTG 602
Ile Lys Cys Gly Val Val Leu Thr Gly Leu Ser Leu Val Thr Thr Val
145 150 155 

ACA CTT CAT ATA ATC CTA AAC AAT TTC ATA TTT CAG CAG TTC CGT CAG 650 
Thr Leu His Ile Ile Leu Asn Asn Phe Ile Phe Gln Gln Phe Arg Gln
160 165 170 175 

CTT ACT TAT GGA CAC TTT CAT CCT GCT CTG TGT GAT CAT GAA AAT TTT 698 
Leu Thr Tyr Gly His Phe His Pro Ala Leu Cys Asp His Glu Asn Phe 
180 185 190 

CCT CAT CTA TAT CAG ATG GCC TCT GAT GAT ACA TCT CTA GCC CTT GCT 746 
Pro His Leu Tyr Gln Met Ala Ser Asp Asp Thr Ser Leu Ala Leu Ala
195 200 205 

CTC GTC TCC TTC ATA ATT CAT TTC AGT TGG AAC TGG ATA GGG TTG GCC 794 
Leu Val Ser Phe Ile Ile His Phe Ser Trp Asn Trp Ile Gly Leu Ala
210 215 220 

ATC TCA GAC AAT GAT CAA GGC ATA CAT TTT CTC TCT TAT TTG AGA AGA 842 
Ile Ser Asp Asn Asp Gln Gly Ile His Phe Leu Ser Tyr Leu Arg Arg
225 230 235 

GAG ATG GAA AAA AAT ACA GTC TGC TTT GCC TTT GTC AAC ATT ATT CCA 890 
Glu Met Glu Lys Asn Thr Val Cys Phe Ala Phe Val Asn Ile Ile Pro
240 245 250 255 

GTC AAT ATG AAT TTA TAC ATG TCA AGA GCT GAA GTG TAT TAC AGC CAA 938 
Val Asn Met Asn Leu Tyr Met Ser Arg Ala Glu Val Tyr Tyr Ser Gln
260 265 270 

GTT ATG ACA TCA TCC GCA AAT GTT GTT ATC ATT TAT GGT GAT ACA GGG 986 
Val Met Thr Ser Ser Ala Asn Val Val Ile Ile Tyr Gly Asp Thr Gly
275 280 285 

AAT ACG TTA GCT GTG AGC TTT AGA ATG TGG GAC TCT CTA GGT ATA CAG 1034 
Asn Thr Leu Ala Val Ser Phe Arg Met Trp Asp Ser Leu Gly Ile Gln
290 295 300 

AGA CTA TGG GTC ACC ACC TCA CAG TGG GAT GTC ACT CCT TTT AAG AAA 1082 
Arg Leu Trp Val Thr Thr Ser Gln Trp Asp Val Thr Pro Phe Lys Lys
305 310 315 

GAC TTC ACA TTT GAT AAT GGA TAT GGA ACT TTT GGT TTT GGA CAC CGC 1130 
Asp Phe Thr Phe Asp Asn Gly Tyr Gly Thr Phe Gly Phe Gly His Arg 
320 325 330 335 

CAC AGT GAG ATT TCT GGT TTT AAA TAT TTT GTT CAG ACA TTG AAC CCT 1178 
His Ser Glu Ile Ser Gly Phe Lys Tyr Phe Val Gln Thr Leu Asn Pro
340 345 350 

TTC AAA TAC TCA GAT GAA TAT TTG GTA AAG CTG GAA TGG ATG TAT GTT 1226 
Phe Lys Tyr Ser Asp Glu Tyr Leu Val Lys Leu Glu Trp Met Tyr Val 
355 360 365 

AAT TGT AAA ATC TTA GAA TAT AAC TGT AAG TCA CTG AAG AAC TGC TCC 1274 
Asn Cys Lys Ile Leu Glu Tyr Asn Cys Lys Ser Leu Lys Asn Cys Ser
370 375 380

TTT AAT CAC TCA TTG GAA TGG CTA ATG ACA CAT ACT TTT GAC ATG GCC 1322 
Phe Asn His Ser Leu Glu Trp Leu Met Thr His Thr Phe Asp Met Ala 
385 390 395 



ATT ATT GAA GGG AGT TAT GAA ATA TAC AAT GCT GTG TAT GCT TTT GCC 1370
Ile Ile Glu Gly Ser Tyr Glu Ile Tyr Asn Ala Val Tyr Ala Phe Ala
400 405 410 415 

CAT GCA CTC CAT GAG ATG ACT CTT CAA AAT GTT GAT AAT GTT CTC CTT 1418 
His Ala Leu His Glu Met Thr Leu Gln Asn Val Asp Asn Val Leu Leu
420 425 430 

CCC AAT TAT GAA GAA CAA AAT TAT AAT TGC AAG ATG GTT TAT TCC TTT 1466 
Pro Asn Tyr Glu Glu Gln Asn Tyr Asn Cys Lys Met Val Tyr Ser Phe
435 440 445 

CTG AGC AAG ACT CAA TTC ACA AAT CCT GTT GGA GAC ACT GTG AAT ATG 1514 
Leu Ser Lys Thr Gln Phe Thr Asn Pro Val Gly Asp Thr Val Asn Met
450 455 460 

AAT CAA AGA AAC AAA CTG AAG GAA GAG TAC GAC ATT TTC TAC AAT TGG 1562 
Asn Gln Arg Asn Lys Leu Lys Glu Glu Tyr Asp Ile Phe Tyr Asn Trp
465 470 475 

AAT TTT CCA CAG GGA CTT GGA TTT AAA GTG AAA ATA GGA ATA TTT AGT 1610 
Asn Phe Pro Gln Gly Leu Gly Phe Lys Val Lys Ile Gly Ile Phe Ser
480 485 490 495 

CCA TAT TTT CCA AAA GGT CAA CAG CTT CAT TTA TCT GAA AAT CTG ATA 1658 
Pro Tyr Phe Pro Lys Gly Gln Gln Leu His Leu Ser Glu Asn Leu Ile
500 505 510 

GAG TGG TCC ACA GGA CGT ATA CAG ATG CCA ACC TCT GTG TGC AGT GCC 1706 
Glu Trp Ser Thr Gly Arg Ile Gln Met Pro Thr Ser Val Cys Ser Ala
515 520 525 

GAT TGT GGT CCT GGA TTT AGG AAA GTC TGG AAG AAT GGA ATG CCA GCC 1754 
Asp Cys Gly Pro Gly Phe Arg Lys Val Trp Lys Asn Gly Met Pro Ala
530 535 540 

TGT TGT TTT GAC TGC AGT CCC TGC CCA GAA AAT GAA ATT TCT AAT GAG 1802 
Cys Cys Phe Asp Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu
545 550 555 

ACA AAT GTG GAA TTG TGT GTC CAG TGT CCA GAG GAC CAA TAT GCT AAC 1850 
Thr Asn Val Glu Leu Cys Val Gln Cys Pro Glu Asp Gln Tyr Ala Asn
560 565 570 575 

CAA GAG CAG AAT CAC TGC ATT CAC AAA GCT CGT ATC TTT CTC TCT TAT 1898 
Gln Glu Gln Asn His Cys Ile His Lys Ala Arg Ile Phe Leu Ser Tyr
580 585 590 

GAT GAA CCC TTG GGG ATG GCT CTT TCC TTA ATG GCC TTA TGC CTC GCT 1946 
Asp Glu Pro Leu Gly Met Ala Leu Ser Leu Met Ala Leu Cys Leu Ala 
595 600 605 

GCA CTC ACA GTT GTG GTT CTT GGA GTC TTT GTG AAA CAT CAC AGA ACT 1994 
Ala Leu Thr Val Val Val Leu Gly Val Phe Val Lys His His Arg Thr 
610 615 620 

CCC ATA GTT AAG GCC AAT AAC TGC ACT CTC ACC TAC ATC TTG CTC ATC 2042 
Pro Ile Val Lys Ala Asn Asn Cys Thr Leu Thr Tyr Ile Leu Leu Ile
625 630 635 

GCA CTC ATC TTT TGT TTC CTC TGC CCC TTG TTC TTC ATT GGC CAT CCA 2090 
Ala Leu Ile Phe Cys Phe Leu Cys Pro Leu Phe Phe Ile Gly His Pro
640 645 650 655 



AAC TCA GCT ACC TGC ATC CTT CAG CAA ATC ACA TTT GGA GTT GTG TTC 2138
Asn Ser Ala Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Val Val Phe
660 665 670

ACT GTG GCT ATT TCC ACT GTG TTG GCC AAA ACA ACC ACT GTC ATT CTG 2186
Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Thr Thr Val Ile Leu
675 680 685 

GCT TTC AGA GTC ACA GCC CCT CAT AGA ATG ATG AAG TAC TTT CTT GTT 2234 
Ala Phe Arg Val Thr Ala Pro His Arg Met Met Lys Tyr Phe Leu Val 
690 695 700 

TCA AGG GCA TCT AAC TAC ATC ATT CCC ATT TGT ACT CTC ATT CAA ATT 2282 
Ser Arg Ala Ser Asn Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Ile
705 710 715 

ATT GTA TGT GCC ATC TGG CTA GGA GCT TCT CCT CCT TCT GTT GAT ATT 2330 
Ile Val Cys Ala Ile Trp Leu Gly Ala Ser Pro Pro Ser Val Asp Ile
720 725 730 735 

GAT GCA CAG TCT GAG CAT GGT CAC ATC ATC ATT GCT TGC AAC AAG GGT 2378 
Asp Ala Gln Ser Glu His Gly His Ile Ile Ile Ala Cys Asn Lys Gly
740 745 750

TCA GTC ACT GCT TTT TAC TGT GTC CTG GGA TAT CTG GCC TGC CTG GCC 2426 
Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala 
755 760 765 

TTT GTG AGC TTC ACC CTG GCT TTC CTT TCC AGA AAC CTG CCT GTC ACC 2474 
Phe Val Ser Phe Thr Leu Ala Phe Leu Ser Arg Asn Leu Pro Val Thr 
770 775 780 

TTC AAT GAA GCC AAG TCC ATG ACA TTC AGC ATG CTG GTG TTC TGC AGT 2522 
Phe Asn Glu Ala Lys Ser Met Thr Phe Ser Met Leu Val Phe Cys Ser 
785 790 795 

GTC TGG GTC ACT TTC CTA CCT GTT TAC CAT GGC ACC AAA GGC AAG GTT 2570 
Val Trp Val Thr Phe Leu Pro Val Tyr His Gly Thr Lys Gly Lys Val 
800 805 810 815 

ATG GTG GCT GTT GAG ATC TTT TCC ACC TTG GCT TCT AGT GCA GGA ATG 2618 
Met Val Ala Val Glu Ile Phe Ser Thr Leu Ala Ser Ser Ala Gly Met
820 825 830 

TTG GGA TGC ATT TTT GCT CCA AAA TGC TAC ACA ATA CTG TTT AGA CCA 2666 
Leu Gly Cys Ile Phe Ala Pro Lys Cys Tyr Thr Ile Leu Phe Arg Pro
835 840 845 

GAC AGA AAT TCT CTT CAA ATG ATC AGG GAG AAG TCA TCT TCT CAT ACT 2714 
Asp Arg Asn Ser Leu Gln Met Ile Arg Glu Lys Ser Ser Ser His Thr
850 855 860 

CAC ATT TTA TAAAGTCTGA CTGACACAGG CATTGTTGGT TCATAATCAC CAAATATTC 2772 
His Ile Leu
865

GATTACATTG CCATATCTAT TTTTAGAATG ACTGTCACTG TTCCCTTTGA TGATATTGCG 2832 

TAGCAAGATC ATGTCTACTG AGGACTACCT TATCTCCTAT AATCTTCCAA CATTTTCTAC 2892

ATCAATCCTA CTCTTTTAGA GAAAGAGATA ATAGAATTTT AAACATTTTC AGAATTAGAG 2952 

TTCTTCTAGG AACAGAGAAG AGAAAGAATT ATTTTTTCAA CAGGTTGATA GAATATCAGG 3012 

AAAGGGGTTG AAGTCACAAC AATATAAATA AAGCCCTGCT CTTGTATAGG AACTTATGAA 3072 

TACTCAATCC CACCAACTAC CATTAACAAC CACATGTAAC AAATGTTAAA AAGGATCAGA 3132 

TGGTTTCTTA TTGTCTCCAA ATTTGCCTGA ACTTATTTAT GCACATAATG AGACACACAC 3192 

ACACACACAC ACAAACACAC ACACAAATAC AAATTCCATA AAATTTTAAA AATATAGAAT 3252 

ATTACAAAGA CTTAACACTG GCAATCTGCT CTTCAATGTT CATAATTACA GGAACTTACA 3312 

GGAAAATATG GGACATAGGT AGAGATGACT GGGTTTATGT TAAGTCATTT TAAATAAGAA 3372 

CCCTCAATTT TAAGTGTATC ATAAAAGACA CAGTTGTGAA ATTTTCAAGG ACAGCACTAC 3432

TTGTTGAAAT AATCTCCATC TGTGGAATTT ATAGGGTTTT GTGACAAAGA TCAGTTCTGA 3492

TATCAGAGAG TAAACTGAAG CAGGCAACCA TTAGTTGTCA GCACTGACAG CAGCTAATGG 3552 

AGGTTGCTTC AGAAATCAAT TGAGGTTGAT TCTGGCAATG AGCAGTTAGA GAAGATAAAA 3612 

AACAGGGAAA TCAAATATTC ACACACACAC ACACACACAC ACGTACACTC ACATGCACAA 3672 

GCAAGTGCAT GCATGCAAAC CCACACAGAC TACTTGAAGC AAAGGCAAGG TCCAGCCACT 3732 

TGAAACATAC AAATGTGTAC ATATAGACAG ACACAGACAA ACACATACAT ATCCACATGT 3792 

TAAATGGCTG GAGCAATGTC AGCCAGCAGG CTCCATGTAT TTCACATATG TACATATATG 3852 

CATGTAAATA AATATTCAGA TATACACATA TTCACATGTA CTGGTGGGTA GGTGGAATAA 3912 

AGTTCCAAAA AACAGGCCCC AGGAATTTTA CACATAATGT ACAGACATAT ATAACACTAT 3972 

TGGTGGAAGA ACAAGCTCCA ACATATTCAG GGAAGCATTG CATATACATA CATATAGATT 4032 

TGATGGATGG AACAAAGTTC CAACAAATTC TCACATGAAC TTTATATATG TATATACATG 4092 

AAAGGCAGCC TGGTTCCCAG TTGATCAGAG GTTTGAAAGC CCAGTGACCC TAAAAAAGAT 4152 

GGTAGCCATT TAGCCTGATT CCCAGTAAAC CAGGCAAGTC ACTAGCCACA GCCCTCCATA 4212 

GAATTTTGGC CATCAGTCAC TTAAGCCCAA CACCCTCCAC AGATTAAAGG AAGTGATTAC 4272 

AGGTCACAGG GACTCAGAAC ACATTTCCAT TATGTGACAT AGTCAAAGAC TTGGAGACTT 4332

AGCCAATGAA CTTTCCTTCC CTGAAACTCC TCCCTGCAGG CCAACCTTGA AAAGAGGGGT 4392

ATGGTTTTAC TCATCTGCTT TCAGCCATGA CAATAAATGA CTTAAAACAA TGAAAAAAAA 4452 

AAAAAAAAAA AAAAA 4467 

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 866 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:



Met Ser Arg Leu Arg Ala Gly Lys Asn Met Leu Thr Phe Ile Leu Leu
1 5 10 15

Phe Phe Leu Leu Asn Ile Pro Leu Phe Val Pro Ser Phe Ile Tyr Pro
20 25 30

Arg Cys Phe Trp Ser Met Lys Lys Asn Glu Tyr Gln Asp Arg Asn Leu
35 40 45

Gly Thr Gly Cys Met Phe Phe Ile Leu Ala Val Gln Gln Pro Met Glu
50 55 60

Lys Glu Tyr Phe Ser His Ile Ser Asn Ile Gln Thr Pro Thr Glu Asn
65 70 75 80

Gln Lys Tyr Pro Leu Thr Leu Ala Phe Ser Met Asn Glu Ile Asn Asn
85 90 95

Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ala Phe Thr Phe Ser Glu
100 105 110

Tyr Ser Cys Tyr Leu Glu Ser His His Lys Arg Leu Phe Asn Phe Ser
115 120 125

Leu Lys Asn His Glu Ile Leu Pro Asn Phe Ile Cys Thr Lys Asp Ile
130 135 140

Lys Cys Gly Val Val Leu Thr Gly Leu Ser Leu Val Thr Thr Val Thr
145 150 155 160

Leu His Ile Ile Leu Asn Asn Phe Ile Phe Gln Gln Phe Arg Gln Leu
165 170 175

Thr Tyr Gly His Phe His Pro Ala Leu Cys Asp His Glu Asn Phe Pro
180 185 190

His Leu Tyr Gln Met Ala Ser Asp Asp Thr Ser Leu Ala Leu Ala Leu
195 200 205

Val Ser Phe Ile Ile His Phe Ser Trp Asn Trp Ile Gly Leu Ala Ile
210 215 220

Ser Asp Asn Asp Gln Gly Ile His Phe Leu Ser Tyr Leu Arg Arg Glu
225 230 235 240

Met Glu Lys Asn Thr Val Cys Phe Ala Phe Val Asn Ile Ile Pro Val
245 250 255

Asn Met Asn Leu Tyr Met Ser Arg Ala Glu Val Tyr Tyr Ser Gln Val
260 265 270



Met 


Thr 


Ser 


Ser 


Ala 


Asn 


Val 


Val 






275 










280 


Thr 


Leu 


Ala 


Val 


Ser 


Phe 


Arcr 


Met 




290 










295 




Leu 


Trp 


Val 


Thr 


Thr 


Ser 


Gin 


Tro 


305 










310 




Phe 


Thr 


Phe 




Asn 


Glv 


Tvr 
y 


Glv 










325 








Ser 


Glu 


lie 


Ser 


Glv 


Phe 


■ Lvs 


Tvr 








340 










Lvs 


Tvr 


Ser 


Asp 


Glu 


Tvr 


Leu 


Val 






355 










360 


Cvs 


Lys 


lie 


Leu 


Glu 


Tvr 


Asn 


Cys 




370 










375 




Asn 


His 


Ser 


Leu 


Glu 




Leu 


Met 


385 










390 






lie 


Glu 


Glv 


Ser 


Tvr 


Glu 


He 


Tvr 










405 








Ala 


Leu 


His 


Glu 


Met 


Thr 


Leu 


Gin 








420 










Asn 


Tvr 


Glu 


Glu 


Gin 


Asn 


Tvr 


Asn 






435 










440 


Ser 


Lvs 


Thr 


Gin 


Phe 


Thr 


Asn 


Pro 




450 










455 




Gin 


Arcr 


Asn 


Lvs 


Leu 


Lvs 


Glu 


Glu 


465 










470 






Phe 


Pro 


Gin 


Glv 


Leu 


Glv 


Phe 


Lys 










485 








Tyr 


Phe 


Pro 


Lvs 


Glv 


Gin 


Gin 


Leu 








500 










Trp 


Ser 


Thr 


Glv 


Arcr 


lie 


Gin 


Met 






515 










520 


Cvs 


Gly 


Pro 


Glv 


Phe 


Arcr 


Lys 


Val 




530 










535 




Cvs 


Phe 


Asp 


Cvs 


Ser 


Pro 


Cvs 


Pro 


545 










550 






Asn 


Val 


Glu 


Leu 


Cys 


Val 


Gin 


Cys 










565 








Glu 


Gin 


Asn 


His 


Cvs 


lie 


His 


Lys 








580 










Glu 


Pro 


Leu 


Gly 


Met 


Ala 


Leu 


Ser 






595 










600 


Leu 


Thr 


Val 


Val 


Val 


Leu 


Glv 


Val 




610 










615 




lie 


Val 


Lys 


Ala 


Asn 


Asn 


Cys 


Thr 


625 










630 






Leu 


lie 


Phe 


Cvs 


Phe 


Leu 


Cys 


Pro 










645 








Ser 


Ala 


Thr 


Cys 


lie 


Leu 


Gin 


Gin 








660 










Val 


Ala 


lie 


Ser 


Thr 


Val 


Leu 


Ala 






675 










680 


Phe 


Arg 


Val 


Thr 


Ala 


Pro 


His 


Arg 




690 










695 


Arg 


Ala 


Ser 


Asn 


Tyr 


He 


He 


Pro 


705 










710 






Val 


Cys 


Ala 


lie 


Trp 


Leu 


Gly 


Ala 










725 








Ala 


Gin 


Ser 


Glu 


His 


Gly 


His 


He 








740 










Val 


Thr 


Ala 


Phe 


Tyr 


Cys 


Val 


Leu 






755 










760 


Val 


Ser 


Phe 


Thr 


Leu 


Ala 


Phe 


Leu 




770 










775 
265 270



He 


He 


Tvr 


Glv 


Asp 


Thr 


Glv 


Asn 










285 








Trp 


Asp 


Ser 


Leu 


Glv 


He 


Gin 










300 










Asp 


Val 


Thr 


Pro 


Phe 


lay a 


Lys 


Asp 






315 










J A U 


Thr 


Phe 


Glv 


Phe 


Glv 


His 


Arg 


nip 




330 










335 




Phe 


Val 


Gin 


Thr 


Leu 


Asn 




trllts 


345 










350 






Lys 


Leu 


Glu 




Met 


±yr 


V Gil 


Asn 










365 








Lys 


Ser 


Leu 


Lys 


& en 
noil 


v»yt» 


e 


It I Hz 








380 










Thr 


His 


Thr 


Phe 


Asp 


Met 


Hia 


IXC 






395 










Ann 


Asn 


Ala 


Val 




Ala 


Phe 


Ala 


His 




410 










415 




Asn 


Val 


Asp 


Asn 


Val 


Leu 


Leu 


Pro 


425 










430 






Cvs 


Lys 


Met 


Val 




Ser 


lr lie 


Leu 










445 








Val 


Glv 


Asp 


Thr 


Val 


Asn 


fits u 


Asn 








460 










Tvr 


Asp 


He 


Phe 


Tvr 

xyt 


Asn 


Tm 


noil 






475 










480 


Val 


Lvs 


He 


Glv 


He 


Phe 


Ser 


Pro 




490 










495 




His 


Leu 


Ser 


Glu 


Asn 


Leu 


Tip 


Glu 


505 










510 






Pro 


Thr 


Ser 


Val 


Cys 


Ser 


Ala 


p 










525 








Trp 


Lys 


Asn 


Glv 


Met 


Pro 


Ala 










540 










Glu 


Asn 


Glu 


He 


Ser 


Asn 


Glu 


Thr 






555 










560 


Pro 


Glu 


Asp 


Gin 


Tvr 


Ala 


Asn 


Gin 




570 










575 




Ala 


Arcr 


He 


Phe 


Leu 


Ser 


xyx 


Asp 


585 










590 






Leu 


Met 


Ala 


Leu 


v-ye 


Leu 


Ala 


Ala 










605 








Phe 


Val 


Lvs 


His 


His 




Thr 


Pro 








620 










Leu 


Thr 


Tvr 


He 


Leu 


Leu 


He 


Ala 






635 










640 


Leu 


Phe 


Phe 


He 


Glv 


His 


Pro 


Asn 




650 










655 




He 


Thr 


Phe 


Glv 


Val 


Val 


Phe 


Thr 


665 










670 






Lys 


Thr 


Thr 


Thr 


Val 


He 


Leu 


Ala 










685 








Met 


Met 


Lys 


Tyr 


Phe 


Leu 


Val 


Ser 








700 










He 


Cys 


Thr 


Leu 


He 


Gin 


He 


He 






715 










720 


Ser 


Pro 


Pro 


Ser 


Val 


Asp 


He 


Asp 




730 










735 




He 


lie 


Ala 


Cys 


Asn 


Lys 


Gly 


Ser 


745 










750 






Gly 


Tyr 


Leu 


Ala 


Cys 


Leu 


Ala 


Phe 










765 








Ser 


Arg 


Asn 


Leu 


Pro 


Val 


Thr 


Phe 



780 
Asn Glu Ala Lys Ser Met Thr Phe Ser Met Leu Val Phe Cys Ser Val
785                 790                 795                 800

Trp Val Thr Phe Leu Pro Val Tyr His Gly Thr Lys Gly Lys Val Met
                805                 810                 815

Val Ala Val Glu Ile Phe Ser Thr Leu Ala Ser Ser Ala Gly Met Leu
            820                 825                 830

Gly Cys Ile Phe Ala Pro Lys Cys Tyr Thr Ile Leu Phe Arg Pro Asp
        835                 840                 845

Arg Asn Ser Leu Gln Met Ile Arg Glu Lys Ser Ser Ser His Thr His
    850                 855                 860

Ile Leu
865

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2916 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 299...2635

(D) OTHER INFORMATION: GoVN5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

CGGCACGAGT TCAACTAGTC ATGTTCAAGA AGGGGCAAAT ACTTTGTTAA TATGCTCTTC 60 
GCTTGGACTT TTATCTCTTG CTTTCTGCAG ATTCCAATTA TTTTATGCTC CTACAGAAGC 120 
AGCGAGTGCT TAGTCAAGAT GAATTATCGT TTAAAGGGGA AAGGAAATGT GGTGATTGTT 180 
GGATTTTTCC CTGCTTTTGC TGTCTACCCC CTCAACAAAA CAATTGACTG GTGGATGCTT 240 
AAATTCAGCA AAGAATTATG ATTGAGTTTA AGTTGAAGAG CTACCAGTAT ATTTGGCC AT 300 

Met 
1 

GAG GTT TGC CAT TGA GGA AAT CAA CAG CAA TCC CCA TCT TTT ACC AAA 348 
Arg Phe Ala Ile Glu Glu Ile Asn Ser Asn Pro His Leu Leu Pro Asn
5 10 15 

CAC ATC CCT GGG ATT TGA GAT CAA TAA TGT CCC ACA CGG TCA GAG GTA 396 
Thr Ser Leu Gly Phe Glu Ile Asn Asn Val Pro His Gly Gln Arg Tyr
20 25 30 

CAC TCT GGT CAA ACT TTT TAG CTC ACT TTC AGG GTC TAA TTA TGA CAT 444 
Thr Leu Val Lys Leu Phe Ser Ser Leu Ser Gly Ser Asn Tyr Asp Ile
35 40 45 

TCC TAA CTA CAT AAG TGC AAG TGA GAG CAA TTC TGC TGC TGT ACT TAC 492 
Pro Asn Tyr Ile Ser Ala Ser Glu Ser Asn Ser Ala Ala Val Leu Thr
50 55 60 65 

AGG ACC ATC GTG GAC AAT ATC TGA ATG CGT AGG GAC ACT CCT GGA TCT 540
Gly Pro Ser Trp Thr Ile Ser Glu Cys Val Gly Thr Leu Leu Asp Leu
70 75 80 

TTA CAA ATT TCC ACA GCT TAC TTT TGG GCC TTT TGA TAG TCT CCT GAG 588 
Tyr Lys Phe Pro Gln Leu Thr Phe Gly Pro Phe Asp Ser Leu Leu Ser
85 90 95 



TGA ACA AAG ACG GTT TTC TTC TCT GTA CCA AGT GGC CCC CAA AGA TAC 636
Glu Gln Arg Arg Phe Ser Ser Leu Tyr Gln Val Ala Pro Lys Asp Thr
100 105 110

ATT TCT GAC GCC TGG CAT TGT ATC TTT GAT GCT TCA TTT CCA CTG GAA 684 
Phe Leu Thr Pro Gly Ile Val Ser Leu Met Leu His Phe His Trp Asn
115 120 125 

CTG GGT GGG GTT ATT CAT CAT AGA TGA TGA CAA AGG TGC CCA GAC ACT 732 
Trp Val Gly Leu Phe Ile Ile Asp Asp Asp Lys Gly Ala Gln Thr Leu
130 135 140 145

GTC AGA CTT GAG AAA TGA GAT GGA TAA AAA TGG AGT CTG CAC AGC ATT 780
Ser Asp Leu Arg Asn Glu Met Asp Lys Asn Gly Val Cys Thr Ala Phe
150 155 160

TGT AGA AAT GAT CCC AGT CAT CAA GGG TTC ATT TTT TAC CAA ATC CTG 828 
Val Glu Met Ile Pro Val Ile Lys Gly Ser Phe Phe Thr Lys Ser Trp
165 170 175 

GAA AAA TCA TGT GCA GAT CCT GGA ATC ATC ATC AAA TGT GAT TAT TAT 876 
Lys Asn His Val Gln Ile Leu Glu Ser Ser Ser Asn Val Ile Ile Ile
180 185 190 

TTA TGG GGA CTC TGA TTC TCT ATT AAG CTT AAT AGT AAA TAT TAA GCA 924
Tyr Gly Asp Ser Asp Ser Leu Leu Ser Leu Ile Val Asn Ile Lys Gln
195 200 205 

GAA GTT GCT CAC ATG GAA AGT GTG GGT ACT GAT CTC ACA GTG GGA TGT 972 
Lys Leu Leu Thr Trp Lys Val Trp Val Leu Ile Ser Gln Trp Asp Val
210 215 220 225

TTC TAA ATT TGA TGA TTA TTT CAT GGT AGA CTC ATT GCA TGG AGC TCT 1020
Ser Lys Phe Asp Asp Tyr Phe Met Val Asp Ser Leu His Gly Ala Leu
230 235 240

TAT TTT TTC ACA CCA TCG TGA GGA GAT TCC TAA TTT TAC AGA TTT TAT 1068 
Ile Phe Ser His His Arg Glu Glu Ile Pro Asn Phe Thr Asp Phe Met
245 250 255 

GCA GAA GTA CAA CCC TTC CAA GTA CCC GGA AGA CAC TTA TCT TCA TGT 1116 
Gln Lys Tyr Asn Pro Ser Lys Tyr Pro Glu Asp Thr Tyr Leu His Val
260 265 270 

ATT GTG GCA CAT GTA CTT CAA TTG CTC ATT TGT TAA GAA AGA TTG TAA 1164 
Leu Trp His Met Tyr Phe Asn Cys Ser Phe Val Lys Lys Asp Cys Lys 
275 280 285 

AAT TGT GCA CAA CTG TTT GCC TAA TGC CTC CCT GGG GTT CTT GCC TGG 1212 
Ile Val His Asn Cys Leu Pro Asn Ala Ser Leu Gly Phe Leu Pro Gly
290 295 300 305

GAA CAT ATT TGA CAT GGC CAT GAG TGA AGA GAG TTA CAA TGT ATA CAA 1260
Asn Ile Phe Asp Met Ala Met Ser Glu Glu Ser Tyr Asn Val Tyr Asn
310 315 320

TGC TGT GTA TGC TGT GGC CCA CAG TCT GCA TGA GAT GAT TCT CAA CCA 1308 
Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Ile Leu Asn Gln
325 330 335 

AGT ACA ATT TCA AAC TCA TGA AAA AGG AAA AAA GAT GGT ATT CTT TCC 1356 
Val Gln Phe Gln Thr His Glu Lys Gly Lys Lys Met Val Phe Phe Pro
340 345 350 

TTG GCA GCT TCA CCC CTT TCT AAG GGA AAG ACA ACT CAT CAA TCA GAA 1404 
Trp Gln Leu His Pro Phe Leu Arg Glu Arg Gln Leu Ile Asn Gln Asn
355 360 365 
TGG AGC GAA TGA AGA TCT GGA TTG TAC CAG GAA GTC ACA TGT AGA GTA 1452
Gly Ala Asn Glu Asp Leu Asp Cys Thr Arg Lys Ser His Val Glu Tyr
370 375 380 385

TGA CAT TCT CAA CTT TTG GAA TTT CCC AAA AGG TCT TGG GCT AAA TGT 1500
Asp Ile Leu Asn Phe Trp Asn Phe Pro Lys Gly Leu Gly Leu Asn Val
390 395 400

GAA AGT AGG AAC GTT TTC TCC AAG TGC TCC AAA GGA ACA GAA ACT GTC 1548 
Lys Val Gly Thr Phe Ser Pro Ser Ala Pro Lys Glu Gln Lys Leu Ser
405 410 415 

CAT ATC TTC TAA CAT GAT ACA GTG GGC CAC AGG GTC GAC AGA GAT TCC 1596 
Ile Ser Ser Asn Met Ile Gln Trp Ala Thr Gly Ser Thr Glu Ile Pro
420 425 430 

ACA GTC TGT ATG CAG TGA GAG CTG TCA TCC TGG ATT CAG GAA AAC CCA 1644 
Gln Ser Val Cys Ser Glu Ser Cys His Pro Gly Phe Arg Lys Thr His
435 440 445 

CCA GGA AGG CAG GGT TGC CTG TTG CTT TGA CTG CAT TCC TTG TCC AGA 1692 
Gln Glu Gly Arg Val Ala Cys Cys Phe Asp Cys Ile Pro Cys Pro Glu
450 455 460 465

AAA TGA GAT CTC CAA TGA GAC AGA TGT GGA TCA GTG TGT GAA GTG TCC 1740
Asn Glu Ile Ser Asn Glu Thr Asp Val Asp Gln Cys Val Lys Cys Pro
470 475 480

AGA AAC TCA CTA TGC AAA CAT AGA GAA GAT CCA CTG CCT ACA GAA AAC 1788 
Glu Thr His Tyr Ala Asn Ile Glu Lys Ile His Cys Leu Gln Lys Thr
485 490 495 

TGT GAC ATT TCT GTA CTA TGA TGA CCC ATT GGG GAA GAC ACT TTG CTT 1836 
Val Thr Phe Leu Tyr Tyr Asp Asp Pro Leu Gly Lys Thr Leu Cys Phe 
500 505 510 

CAT GTC CCT GGG TTT CTC CTC ACT CAC AGC TGC TGT TCT TGT GGT GTT 1884 
Met Ser Leu Gly Phe Ser Ser Leu Thr Ala Ala Val Leu Val Val Phe 
515 520 525 

TCT GAA GAA CAG GGA CAC CCC CAT TGT CAA GGC CAA TAA CCT GGC TCT 1932 
Leu Lys Asn Arg Asp Thr Pro Ile Val Lys Ala Asn Asn Leu Ala Leu
530 535 540 545

CAG TTA CAC CCT GCT CAT CAC TTT GAT GCT CTG TTT TCT CTG TCC CTT 1980
Ser Tyr Thr Leu Leu Ile Thr Leu Met Leu Cys Phe Leu Cys Pro Leu
550 555 560

GCT CTT CAT TGG CCG TCC CAG CAC AGC CTC CTG TAT CCT GCA GCA AAA 2028 
Leu Phe Ile Gly Arg Pro Ser Thr Ala Ser Cys Ile Leu Gln Gln Asn
565 570 575 

CAT TTT TGG GCT TCT GTT CAC TGT GGC TCT TTC CAC TGT GTT GGC CAA 2076
Ile Phe Gly Leu Leu Phe Thr Val Ala Leu Ser Thr Val Leu Ala Lys
580 585 590 

AAC TAT CAC TGT GGT TAT AGC CTT CAA GAT CAC TTC TCC AGG AAG AAT 2124 
Thr Ile Thr Val Val Ile Ala Phe Lys Ile Thr Ser Pro Gly Arg Ile
595 600 605 

TAG AAG ATG GCT GCT GAT ATC AAG GGC CCC TAA TTT CAT TAT TCC CTT 2172 
Arg Arg Trp Leu Leu Ile Ser Arg Ala Pro Asn Phe Ile Ile Pro Leu
610 615 620 625

ATG CAC CCT GCT CCA AGT TTT TCT ATC TGG AAT TTG GCT GAC AAC CTC 2220
Cys Thr Leu Leu Gln Val Phe Leu Ser Gly Ile Trp Leu Thr Thr Ser
630 635 640

TCC TCC ATT TAT TGA TAA AGA TGC TCA CTC AGA ACA TGG ACA CAT CAT 2268 
Pro Pro Phe Ile Asp Lys Asp Ala His Ser Glu His Gly His Ile Ile
645 650 655 

CAT CAT TTG CAA TAA AGG CTC AGC TGT TGC TTT CCA TTG CAA CCT TGG 2316 
Ile Ile Cys Asn Lys Gly Ser Ala Val Ala Phe His Cys Asn Leu Gly
660 665 670 

ATA CCT GGG AGC ACT AGC CCT AGT GAG CTA CTT TAT GGC TTT CTT GTC 2364 
Tyr Leu Gly Ala Leu Ala Leu Val Ser Tyr Phe Met Ala Phe Leu Ser 
675 680 685 

CAG AAA CCT ACC TGA CAC ATT CAA TGA AGC CAA GTT CCT GGC TTT CAG 2412 
Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Ala Phe Ser
690 695 700 705

CAT GCT GGT GTT CTG CAG TGT CTG GGT CAC CTT CCT CCC TGT CTA CCA 2460
Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His
710 715 720

CAG CAC CAA GGG GAA GAA CAT GGT GGC TAT GGA AGT CTT CTC TAT CTT 2508 
Ser Thr Lys Gly Lys Asn Met Val Ala Met Glu Val Phe Ser Ile Leu
725 730 735 

GGC TTC CAG TAC ATC TCT CCT AGG CAT CAT CTT TGC CCC CAA GTG CTA 2556
Ala Ser Ser Thr Ser Leu Leu Gly Ile Ile Phe Ala Pro Lys Cys Tyr
740 745 750 

CCT CAT ATT ATT AAG ACC AGA AAG GAA TTC ACT TAG CTA TAT CAG GGA 2604 
Leu Ile Leu Leu Arg Pro Glu Arg Asn Ser Leu Ser Tyr Ile Arg Asp
755 760 765 

CAA AAC ATA TGC TAA AAG CAT AAA ACC TTC T TAGCATCCTT ATGTGCCTCT T 2656 
Lys Thr Tyr Ala Lys Ser Ile Lys Pro Ser
770 775 

AAATTAAACA GCATCATTGA AGGCAATTGT TGTTCTTCAC TATCTGAACA CTCACATATA 2716 

AAGTCATAAT TGTACATTTG ATCCAGGGGC TATTATTTCT TTAGTAGTCA TATATATGTA 2776 

CCTAATGCTT TTTTCACATT AAAATATGTG CTGCATTTTT CGTCTTCCTC TTCTACTTAC 2836 

TATTAGTTTT GTGCTATTGA TTTAACTTGC AATAAAATCC AAATTTCTGA GTTCTTCCAA 2896 

AAAAAAAAAA AAAAAAAAAA 2916 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 779 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

Met Arg Phe Ala Ile Glu Glu Ile Asn Ser Asn Pro His Leu Leu Pro
1               5                   10                  15

Asn Thr Ser Leu Gly Phe Glu Ile Asn Asn Val Pro His Gly Gln Arg
            20                  25                  30

Tyr Thr Leu Val Lys Leu Phe Ser Ser Leu Ser Gly Ser Asn Tyr Asp
        35                  40                  45

Ile Pro Asn Tyr Ile Ser Ala Ser Glu Ser Asn Ser Ala Ala Val Leu
    50                  55                  60

Thr Gly Pro Ser Trp Thr Ile Ser Glu Cys Val Gly Thr Leu Leu Asp
65                  70                  75                  80

Leu Tyr Lys Phe Pro Gln Leu Thr Phe Gly Pro Phe Asp Ser Leu Leu
                85                  90                  95

Ser Glu Gln Arg Arg Phe Ser Ser Leu Tyr Gln Val Ala Pro Lys Asp
            100                 105                 110

Thr Phe Leu Thr Pro Gly Ile Val Ser Leu Met Leu His Phe His Trp
        115                 120                 125

Asn Trp Val Gly Leu Phe Ile Ile Asp Asp Asp Lys Gly Ala Gln Thr
    130                 135                 140

Leu Ser Asp Leu Arg Asn Glu Met Asp Lys Asn Gly Val Cys Thr Ala
145                 150                 155                 160

Phe Val Glu Met Ile Pro Val Ile Lys Gly Ser Phe Phe Thr Lys Ser
                165                 170                 175

Trp Lys Asn His Val Gln Ile Leu Glu Ser Ser Ser Asn Val Ile Ile
            180                 185                 190

Ile Tyr Gly Asp Ser Asp Ser Leu Leu Ser Leu Ile Val Asn Ile Lys
        195                 200                 205

Gln Lys Leu Leu Thr Trp Lys Val Trp Val Leu Ile Ser Gln Trp Asp
    210                 215                 220

Val Ser Lys Phe Asp Asp Tyr Phe Met Val Asp Ser Leu His Gly Ala
225                 230                 235                 240

Leu Ile Phe Ser His His Arg Glu Glu Ile Pro Asn Phe Thr Asp Phe
                245                 250                 255

Met Gln Lys Tyr Asn Pro Ser Lys Tyr Pro Glu Asp Thr Tyr Leu His
            260                 265                 270

Val Leu Trp His Met Tyr Phe Asn Cys Ser Phe Val Lys Lys Asp Cys
        275                 280                 285

Lys Ile Val His Asn Cys Leu Pro Asn Ala Ser Leu Gly Phe Leu Pro
    290                 295                 300

Gly Asn Ile Phe Asp Met Ala Met Ser Glu Glu Ser Tyr Asn Val Tyr
305                 310                 315                 320

Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Ile Leu Asn
                325                 330                 335

Gln Val Gln Phe Gln Thr His Glu Lys Gly Lys Lys Met Val Phe Phe
            340                 345                 350

Pro Trp Gln Leu His Pro Phe Leu Arg Glu Arg Gln Leu Ile Asn Gln
        355                 360                 365

Asn Gly Ala Asn Glu Asp Leu Asp Cys Thr Arg Lys Ser His Val Glu
    370                 375                 380

Tyr Asp Ile Leu Asn Phe Trp Asn Phe Pro Lys Gly Leu Gly Leu Asn
385                 390                 395                 400

Val Lys Val Gly Thr Phe Ser Pro Ser Ala Pro Lys Glu Gln Lys Leu
                405                 410                 415

Ser Ile Ser Ser Asn Met Ile Gln Trp Ala Thr Gly Ser Thr Glu Ile
            420                 425                 430

Pro Gln Ser Val Cys Ser Glu Ser Cys His Pro Gly Phe Arg Lys Thr
        435                 440                 445

His Gln Glu Gly Arg Val Ala Cys Cys Phe Asp Cys Ile Pro Cys Pro
    450                 455                 460

Glu Asn Glu Ile Ser Asn Glu Thr Asp Val Asp Gln Cys Val Lys Cys
465                 470                 475                 480

Pro Glu Thr His Tyr Ala Asn Ile Glu Lys Ile His Cys Leu Gln Lys
                485                 490                 495

Thr Val Thr Phe Leu Tyr Tyr Asp Asp Pro Leu Gly Lys Thr Leu Cys
            500                 505                 510

Phe Met Ser Leu Gly Phe Ser Ser Leu Thr Ala Ala Val Leu Val Val
        515                 520                 525

Phe Leu Lys Asn Arg Asp Thr Pro Ile Val Lys Ala Asn Asn Leu Ala
    530                 535                 540

Leu Ser Tyr Thr Leu Leu Ile Thr Leu Met Leu Cys Phe Leu Cys Pro
545                 550                 555                 560

Leu Leu Phe Ile Gly Arg Pro Ser Thr Ala Ser Cys Ile Leu Gln Gln
                565                 570                 575

Asn Ile Phe Gly Leu Leu Phe Thr Val Ala Leu Ser Thr Val Leu Ala
            580                 585                 590

Lys Thr Ile Thr Val Val Ile Ala Phe Lys Ile Thr Ser Pro Gly Arg
        595                 600                 605

Ile Arg Arg Trp Leu Leu Ile Ser Arg Ala Pro Asn Phe Ile Ile Pro
    610                 615                 620

Leu Cys Thr Leu Leu Gln Val Phe Leu Ser Gly Ile Trp Leu Thr Thr
625                 630                 635                 640

Ser Pro Pro Phe Ile Asp Lys Asp Ala His Ser Glu His Gly His Ile
                645                 650                 655

Ile Ile Ile Cys Asn Lys Gly Ser Ala Val Ala Phe His Cys Asn Leu
            660                 665                 670

Gly Tyr Leu Gly Ala Leu Ala Leu Val Ser Tyr Phe Met Ala Phe Leu
        675                 680                 685

Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Ala Phe
    690                 695                 700

Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr
705                 710                 715                 720

His Ser Thr Lys Gly Lys Asn Met Val Ala Met Glu Val Phe Ser Ile
                725                 730                 735

Leu Ala Ser Ser Thr Ser Leu Leu Gly Ile Ile Phe Ala Pro Lys Cys
            740                 745                 750

Tyr Leu Ile Leu Leu Arg Pro Glu Arg Asn Ser Leu Ser Tyr Ile Arg
        755                 760                 765

Asp Lys Thr Tyr Ala Lys Ser Ile Lys Pro Ser
    770                 775



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3307 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 112...1761

(D) OTHER INFORMATION: GoVN6



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

TAAGGCAGGA AAAAATGTTC ATTTTGATGG AAGTCTTCTT CTTCTTCCTT AACATTCCAC 60
TGCTCATGGC AAATTTCATT GATCCCAAGT GCTTTTGGAG AGTAAATTTG A ATG AAG 117

Met Lys
1

TTA AGG GAT AAA GAC TTG AGC ATA ACT TGT TCC TTC ATC CTT GAA GCA 165
Leu Arg Asp Lys Asp Leu Ser Ile Thr Cys Ser Phe Ile Leu Glu Ala
5 10 15

GTT CAG ATG CCT ACG GAA AAC GAT TAT TTC AAC CAG ACT CTG AAT ATC 213
Val Gln Met Pro Thr Glu Asn Asp Tyr Phe Asn Gln Thr Leu Asn Ile
20 25 30

CTA AAA ACA ACA AAA AAC CAC AAA TAT GCT TTG GCA TTG GCC TTT TCA 261
Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Ser
35 40 45 50

ATT GAT GAA ATC AAC AGG AAT CCT GAT CTT TTA CCA AAT ATG TCT TTG 309
Ile Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu
55 60 65
ATC ATA AAA TAC CCT TTG GGC CTT TGC GAT GGA CAA ACT ACA TTA CCT 357
Ile Ile Lys Tyr Pro Leu Gly Leu Cys Asp Gly Gln Thr Thr Leu Pro
70 75 80 

ACA CCC TAT TTA TTT AAT GAA ATA TAT TTT AGG CCT ATC CCT AAT TAT 405 
Thr Pro Tyr Leu Phe Asn Glu Ile Tyr Phe Arg Pro Ile Pro Asn Tyr
85 90 95 

TTC TGT AAT GAA GAG ACT ATG TGT ACA TTT CTA CTT ACA GGA CCG CAT 453 
Phe Cys Asn Glu Glu Thr Met Cys Thr Phe Leu Leu Thr Gly Pro His 
100 105 110

TGG ATA ACA TCT TAT AGT TTC TGG ATA CAC TTG AAC ATC TTC TTA TCT 501 
Trp Ile Thr Ser Tyr Ser Phe Trp Ile His Leu Asn Ile Phe Leu Ser
115 120 125 130 

CCT AGT ATG AAC CCA AAG GAC ACA TCC CTA GCT TTG GCA ATG GTC TCC 549 
Pro Ser Met Asn Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser 
135 140 145 

TTC TTA CTT TAT TTC AAG TGG AAC TGG GTC GGC CTT GTC ATC TCA GAT 597 
Phe Leu Leu Tyr Phe Lys Trp Asn Trp Val Gly Leu Val Ile Ser Asp
150 155 160 

GAT GAT CAA GGC AAT CAA TTT CTC TCT GAG TTG AAA AAA GAG AGC AAA 645 
Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Lys
165 170 175 

ATC AAG GAA ATT TGC TTT GCA TTT GTG AGC ATG CTG GCA ATC GAT GAG 693 
Ile Lys Glu Ile Cys Phe Ala Phe Val Ser Met Leu Ala Ile Asp Glu
180 185 190 

ATT TCA TTT TAT CAT AAA ACT GAA ATG TAC TAC AAC CAA ATT GTG ATG 741 
Ile Ser Phe Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met
195 200 205 210 

TCA TCC ACA AAC GTT ATT ATC ATT TAT GGG AAA ACA GAG AGT ATT ATT 789 
Ser Ser Thr Asn Val Ile Ile Ile Tyr Gly Lys Thr Glu Ser Ile Ile
215 220 225 

GAG TTG AGC TTC AGA ATG TGG GAA TCT CCA GTT ATC CAG AGA ATA TGG 837
Glu Leu Ser Phe Arg Met Trp Glu Ser Pro Val Ile Gln Arg Ile Trp
230 235 240 

GTC ACC ACA AAA GAA ATG AAT TTC CCT ACC AGT AAG AGA GAT TTA ACT 885 
Val Thr Thr Lys Glu Met Asn Phe Pro Thr Ser Lys Arg Asp Leu Thr 
245 250 255 

CAT GAC ACA TTC TAT GGG ACT CTT ACT TTT CTA CAC AGC CAT GGG GAG 933 
His Asp Thr Phe Tyr Gly Thr Leu Thr Phe Leu His Ser His Gly Glu 
260 265 270 

ATT TCA GGC TTT AAA AAT TTT GTA CAG ACA TGG TAC CAT CTT AGA ATC 981 
Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu Arg Ile
275 280 285 290 

ACT GAT TTG CAT CTA GTA ATG CCA GAG TGG AAA TAT TTT AAC TAT GAA 1029 
Thr Asp Leu His Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu 
295 300 305 

GCC TCA GCA TCT AAC TGT AAA ATA TTG AAG AAC TAT TCA TCC AGT GCC 1077 
Ala Ser Ala Ser Asn Cys Lys Ile Leu Lys Asn Tyr Ser Ser Ser Ala
310 315 320 



TCA TTG GAA TGG TTA ATG GAG CAG ACA TTT GAC ATG GTC TTT AGT GAT 1125
Ser Leu Glu Trp Leu Met Glu Gln Thr Phe Asp Met Val Phe Ser Asp
        325                 330                 335

GGA AGT CGG GAT ATA TAT AAT GCT GTA AAT GCC ATG GCC CAT GCA CTC 1173
Gly Ser Arg Asp Ile Tyr Asn Ala Val Asn Ala Met Ala His Ala Leu
    340                 345                 350

CAT GAG ATG AAT CTG CAC CTG GTT GAT AAT CAG GCA ATA GAC AAT GGG 1221
His Glu Met Asn Leu His Leu Val Asp Asn Gln Ala Ile Asp Asn Gly
355                 360                 365                 370

AAA GGA GCC AGT TCT CAC TGC TTT AAG ATA AAC TCC TTT CTC AGA AAG 1269
Lys Gly Ala Ser Ser His Cys Phe Lys Ile Asn Ser Phe Leu Arg Lys
                375                 380                 385

ACC CAC TTC ACT AAT CCT CTT GGG GAC AGA GTG ATT ATG AAA GAG AGA 1317
Thr His Phe Thr Asn Pro Leu Gly Asp Arg Val Ile Met Lys Glu Arg
            390                 395                 400

GAA ATA CTG CAA GAA GAC TAT AAC ATT TTT CAC ACT TGG AAT     TCT 1365
Glu Ile Leu Gln Glu Asp Tyr Asn Ile Phe His Thr Trp Asn Phe Ser
        405                 410                 415

CAG CAC ATT GGT TTT AAG GTG AAG ATA GGA AAG TTC AGC CCA TAT TTT 1413
Gln His Ile Gly Phe Lys Val Lys Ile Gly Lys Phe Ser Pro Tyr Phe
    420                 425                 430

CCA CAT GGC AGG CAC TTT CAC CTA TAT GTA GAC ATG ATT GAG TTG GCT 1461
Pro His Gly Arg His Phe His Leu Tyr Val Asp Met Ile Glu Leu Ala
435                 440                 445                 450

ACA GGA AGT AGA AAG ATG CCA TCC TCT GTG TGC ACT GAA GAT TGT AGT 1509
Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Thr Glu Asp Cys Ser
                455                 460                 465

CCT GGA TAC AGA AGA TTC TGG AAG GAG GGA ATG GCA GCC TGC TGT TTT 1557
Pro Gly Tyr Arg Arg Phe Trp Lys Glu Gly Met Ala Ala Cys Cys Phe
            470                 475                 480

GTT TGC AGT CCC TGC CCT GAA AAT GCA ATT TCT AAT GAG ACA AAT ATG 1605
Val Cys Ser Pro Cys Pro Glu Asn Ala Ile Ser Asn Glu Thr Asn Met
        485                 490                 495

GAT CAG TGT GTG AAT TGT CCA GAA TAC CAA TAT GCC AAT ACA AAG CGG 1653
Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Lys Arg
    500                 505                 510

GAC AAA TGC ATT CAG AAA AAT GTG ATG TTT CTA AGC TAC AAA GAC CCC 1701
Asp Lys Cys Ile Gln Lys Asn Val Met Phe Leu Ser Tyr Lys Asp Pro
515                 520                 525                 530

CTT GGG GAT GAC TCT TGC CTT CAT AGC CTT CTT TTT CTC TGC ATT AAC 1749
Leu Gly Asp Asp Ser Cys Leu His Ser Leu Leu Phe Leu Cys Ile Asn
                535                 540                 545



AGC TGT TGT ACT TAGGGTCTTT GTGAAGCACC ATGACACTCC TATTGTGAAG GCCAA 1806 
Ser Cys Cys Thr 
550 



TAACAGAATC CTCAGCTACC TATTAATCAC GTCTCTCTTG TTCTGTTTTC TCTGCTCATT 1866 

TTTCTTCATT GGCCATCCTA ACAGAGCAAC CTGCATCTTA CAGCAAATCA CATTTGGAAT 1926 

TGTATTCACT GTGGCTATTT CTACAATTTT GGCAAAAACA ATCACTGTGG TTCTGGCTTT 1986 

CAAAGTCACA AACCCAGGAA GAAGGTTGAG AAACTTCCTA GTATTGGGTA CACTCAACTA 2046 

CATTATC CCC ATATGTTCCC TGTTTCAATG TATTCTGTGT GCAATCTGGC TAGCAGTTTC 2106 

TCCTCCCTTT GTTGATACTG ATGAACACAC TGAGTATGGC CACATCATCA TTGTGTGCAA 2166 
CAAAGGCTCA GTAACTGCAT TCTACTGTGT CCTGGGATAC TTGGCCTGCT TGGCACTTGC 2226

AAGCTTCACT GTGGCTTTCT TGGCAAAGAA TCTGCCAGAC ACATTCAATG AAGCCAAGTT 2286 

CTTGACCTTC AGCATGCTGG TGTTCTGCAG TGTCTGGGTC ACCTTCCTCC CTGTCTACCA 2346 

CAGCACCAAG GGCAAGATCA TGGTTGCTGT GGAGATATTC TCCATTTTGG CATCCAGTGC 24 06 

AGGGATGCTT GGATGCATCT TTGCACCCAA GATTTACATC ATTTTAATGA GACCAGAGAG 2466 

AAATGCTATC CAAAAGATCA GGGAGAAATC ATATTTCTGA ACAAATTATT TCAGAATTTC 2526 

TATCAAATGT AAACATGGTA TATACCCATC AAATATTGTG TTACAGTGCA TGTATCTAGT 2586 

TTTAGAATCA CTCTCACTGG TACCCCTAGT GATGTCTAGA' AATATCATAT CTACCAATCT 2646 

TGAATACATT GTCCATAAAA TCTTGTACAT ATTCACTAGC TTAGTTTCCT GTGGGAGAAC 2706 

TAAAATTCTC AAATTATTAT TACAATTTTA TTCATAATTT TGCTCTCATG G CAAATCAG A 2766 

ACTCATTTTC TAATTTCCAG TAACAACACA TACATGACAG AATACTGATT TTCAGCTATT 2826 

CTTTAAGCTA TTGGCCAATA GACTAAGGTG GAAATGTTCT TTTTCTTTCT GAAACACAAA 2886 

AATATTATAT CATATAATAC ACAGAAGTCA GGGACCCCTA TGGATGAATT AGGGAATAGT 2946 

TGGAAGAAGC TGGCTGAGTA GAAGGGTGAC C CAT AGGAAG ACCAGCAGTC TCACCTAACA 3006 

AGGACAACCA AGATCTTGCT GACACTGAAT CACTTGCTAG GCAGTTGATT TGAGGCCCCT 3 066 

GACACATATC AAGCATAGGA CTACATTGGC TGGCCTCAGT GGGAGAAGAC AACCTAACCC 3126 

CCTAGAGACT TGAGGCCCCA GGCTAAGGGG AGGTTGGGGG TTTTGAAAGT TGGGGATATT 3186 

ATCTTGGAGT TGGGGAGGGG TATGGGATGA AGAAGAGTCA GGAGGCAGGT GCTGGTTGGA 3246 

GTATAATGAC TGGACTGTAA ATAAAAGACT AACAACCAAA AATAAATAAA ATAACTTAAA 3306 

A 3307 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 550 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:



Met Lys Leu Arg Asp Lys Asp Leu Ser Ile Thr Cys Ser Phe Ile Leu
1               5                   10                  15

Glu Ala Val Gln Met Pro Thr Glu Asn Asp Tyr Phe Asn Gln Thr Leu
            20                  25                  30

Asn Ile Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Ala
        35                  40                  45

Phe Ser Ile Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met
    50                  55                  60

Ser Leu Ile Ile Lys Tyr Pro Leu Gly Leu Cys Asp Gly Gln Thr Thr
65                  70                  75                  80

Leu Pro Thr Pro Tyr Leu Phe Asn Glu Ile Tyr Phe Arg Pro Ile Pro
                85                  90                  95

Asn Tyr Phe Cys Asn Glu Glu Thr Met Cys Thr Phe Leu Leu Thr Gly
            100                 105                 110

Pro His Trp Ile Thr Ser Tyr Ser Phe Trp Ile His Leu Asn Ile Phe
        115                 120                 125

Leu Ser Pro Ser Met Asn Pro Lys Asp Thr Ser Leu Ala Leu Ala Met
    130                 135                 140

Val Ser Phe Leu Leu Tyr Phe Lys Trp Asn Trp Val Gly Leu Val Ile
145                 150                 155                 160

Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu
                165                 170                 175

Ser Lys Ile Lys Glu Ile Cys Phe Ala Phe Val Ser Met Leu Ala Ile
            180                 185                 190

Asp Glu Ile Ser Phe Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile
        195                 200                 205

Val Met Ser Ser Thr Asn Val Ile Ile Ile Tyr Gly Lys Thr Glu Ser
    210                 215                 220

Ile Ile Glu Leu Ser Phe Arg Met Trp Glu Ser Pro Val Ile Gln Arg
225                 230                 235                 240

Ile Trp Val Thr Thr Lys Glu Met Asn Phe Pro Thr Ser Lys Arg Asp
                245                 250                 255

Leu Thr His Asp Thr Phe Tyr Gly Thr Leu Thr Phe Leu His Ser His
            260                 265                 270

Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu
        275                 280                 285

Arg Ile Thr Asp Leu His Leu Val Met Pro Glu Trp Lys Tyr Phe Asn
    290                 295                 300

Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Lys Asn Tyr Ser Ser
305                 310                 315                 320

Ser Ala Ser Leu Glu Trp Leu Met Glu Gln Thr Phe Asp Met Val Phe
                325                 330                 335

Ser Asp Gly Ser Arg Asp Ile Tyr Asn Ala Val Asn Ala Met Ala His
            340                 345                 350

Ala Leu His Glu Met Asn Leu His Leu Val Asp Asn Gln Ala Ile Asp
        355                 360                 365

Asn Gly Lys Gly Ala Ser Ser His Cys Phe Lys Ile Asn Ser Phe Leu
    370                 375                 380

Arg Lys Thr His Phe Thr Asn Pro Leu Gly Asp Arg Val Ile Met Lys
385                 390                 395                 400

Glu Arg Glu Ile Leu Gln Glu Asp Tyr Asn Ile Phe His Thr Trp Asn
                405                 410                 415

Phe Ser Gln His Ile Gly Phe Lys Val Lys Ile Gly Lys Phe Ser Pro
            420                 425                 430

Tyr Phe Pro His Gly Arg His Phe His Leu Tyr Val Asp Met Ile Glu
        435                 440                 445

Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Thr Glu Asp
    450                 455                 460






Cys 


Ser 


Pro 


Gly Tyr 


Arg 


Arg 


Phe 


Trp 


Lys 


Glu Gly 


Met 


Ala 


Ala 


Cys 


465 








470 










475 








480 


Cys 


Phe 


Val 


Cys Ser 


Pro 


Cys 


Pro 


Glu 


Asn 


Ala He 


Ser 


Asn 


Glu 


Thr 








485 










490 








495 




Asn 


Met 


Asp 


Gin Cys 


Val 


Asn 


Cys 


Pro 


Glu 


Tyr Gin 


Tyr 


Ala 


Asn 


Thr 








.500 








505 








510 






Lys 


Arg 


Asp 


Lys Cys 


He 


Gin 


Lys 


Asn 


Val 


Met Phe 


Leu 


Ser 


Tyr 


Lys 






515 








520 








525 




Asp 


Pro 


Leu 


Gly Asp 


Asp 


Ser 


Cys 


Leu 


His 


Ser Leu 


Leu 


Phe 


Leu 


Cys 




530 








535 








540 








He 


Asn 


Ser 


Cys Cys 


Thr 




















545 








550 





















(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3938 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 46...2424

(D) OTHER INFORMATION: GoVN7



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

CGGCACGAGC CCAGGTTTAA GGCTGGAAAA AATATGTTCA TTTTG ATG ATA GTA TTC 57 

Met Ile Val Phe
1 

TTT CTC CTC AAC ATT CCA CTT CTC ATG GCA AAT TCC GTT GAT CCC AGG 105 
Phe Leu Leu Asn Ile Pro Leu Leu Met Ala Asn Ser Val Asp Pro Arg
5 10 15 20 
TGC TTT TGG AAA ATA AAT TTG AAT GAA GTC AAG GAT ATA GAT TTA GAT 153
Cys Phe Trp Lys Ile Asn Leu Asn Glu Val Lys Asp Ile Asp Leu Asp
25 30 35 

ACA AGT TGT TAC TTC ATC CTT GAG GCA GTT CAG TTG CCT ATG GAG AAA 201 
Thr Ser Cys Tyr Phe Ile Leu Glu Ala Val Gln Leu Pro Met Glu Lys
40 45 50 

GAT TAT TTC AAC CAG ACT CTG AAT GTC CTA AAA ACA ACC AAA TAC AAC 249 

Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Lys Thr Thr Lys Tyr Asn
55 60 65 

AGA TAT GCA TTG GCA TTA GCC TTT ACA ATG GAT GAA ATA AAC AGG AAT 297 

Arg Tyr Ala Leu Ala Leu Ala Phe Thr Met Asp Glu Ile Asn Arg Asn
70 75 80 

CCT CAT ATT TTA CCA AAC ATG TCT TTG ATT ATA AAA CAT ACA TTG GGC 345 

Pro His Ile Leu Pro Asn Met Ser Leu Ile Ile Lys His Thr Leu Gly

85 90 95 100 

CAC TGT GAT GGA AAT ATC CCA CTC CGC TTA CTT AAT GAA ATA TTT TAT 393 

His Cys Asp Gly Asn Ile Pro Leu Arg Leu Leu Asn Gln Ile Phe Tyr
105 110 115

ATG CCT TTT CCT AAT TAT GGC TGT AAT GAA GAG ACT ATG TGT TCA TTT 441 

Met Pro Phe Pro Asn Tyr Gly Cys Asn Glu Glu Thr Met Cys Ser Phe 
120 125 130 

ATG CTT ATG GGA CCG AAT TTG TGG CCA TCT GTA GAT TTT TTC ATT CAC 489 

Met Leu Met Gly Pro Asn Leu Trp Pro Ser Val Asp Phe Phe Ile His
135 140 145 

TTG AAC ATC TTA TTT CCT CAT TTC CTT CAG ATT TCC TTC GGA CCT TTC 537 

Leu Asn Ile Leu Phe Pro His Phe Leu Gln Ile Ser Phe Gly Pro Phe
150 155 160 

CAT TCC ATT TTC AGT GAT AAT GAA GAA TTT CCT TAT ATC TAT CAG ATG 585 

His Ser Ile Phe Ser Asp Asn Glu Gln Phe Pro Tyr Ile Tyr Gln Met

165 170 175 180 

ACC CCA AAG GAT ACA TCA CTA GCA TTG GCA ATG GTC TCT TTC ATA CTT 633 

Thr Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu
185 190 195 

TAC TTC AAC TGG AAC TGG GTT GGT CTT GTC CTC TCA GAT AAT GAT GAA 681 

Tyr Phe Asn Trp Asn Trp Val Gly Leu Val Leu Ser Asp Asn Asp Glu 
200 205 210 

GGC AAT CAA TTT CTC ACA GAG TTG AAA AAA GAG ACC CAC AAC ACG GAA 729 

Gly Asn Gln Phe Leu Thr Glu Leu Lys Lys Glu Thr His Asn Thr Glu
215 220 225 

ATA TGC TTT GCC TTT GTG AAC ATG ATG GCA ATC AAT GAG AAT TCA TCC 777 

Ile Cys Phe Ala Phe Val Asn Met Met Ala Ile Asn Glu Asn Ser Ser
230 235 240 

ATG AAA AAA ACT GAC ATG TAC TAC AAC CAA ATT GTG ATG TCA ACC GCA 825 

Met Lys Lys Thr Asp Met Tyr Tyr Asn Gln Ile Val Met Ser Thr Ala

245 250 255 260 

AAT GTT ATT ATC ATT TAT GGG GAA CGA CCC AGT ATT ATT GAA CTG TGT 873 

Asn Val Ile Ile Ile Tyr Gly Glu Arg Pro Ser Ile Ile Glu Leu Cys
265 270 275 



TTC AGA ACA TGG ACA TCT CCA GTC ATA CAG AGG ATA TGG GTT ACC AAA 921
Phe Arg Thr Trp Thr Ser Pro Val Ile Gln Arg Ile Trp Val Thr Lys
280 285 290 

TCA GAG TTG TAT TTC CCA ACA AGT AAG AGA GAC TTA AGT CAT GGA ACA 969 
Ser Glu Leu Tyr Phe Pro Thr Ser Lys Arg Asp Leu Ser His Gly Thr 
295 300 305 

TTC TAT GGA ACT CTA GCA TTT CAA CAA CAC CAT GAT GTG ATT TCT GGA 1017 
Phe Tyr Gly Thr Leu Ala Phe Gln Gln His His Asp Val Ile Ser Gly
310 315 320 

TTT AAA AAT TTT GTA CAG ACA TGG TAC CAT CTC AAA AGC ATG GAT TTA 1065 
Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu Lys Ser Met Asp Leu
325 330 335 340 

TAT TTA TTA AAG CCA GAG TGG GGT TTC TTT GAA TAT GAA ACC TCA GCA 1113 
Tyr Leu Leu Lys Pro Glu Trp Gly Phe Phe Glu Tyr Glu Thr Ser Ala 
345 350 355 

TCT TAC TGT AAA ATA CTG ATG AGT AAT TCA TCG AAT GTC TCA TTG GAA 1161 
Ser Tyr Cys Lys Ile Leu Met Ser Asn Ser Ser Asn Val Ser Leu Glu
360 365 370 

TGG CTA ATG GAA CAG AAG TTT GAC ATA GCC TTT AAT GAC AAT AGT CAT 1209 
Trp Leu Met Glu Gln Lys Phe Asp Ile Ala Phe Asn Asp Asn Ser His
375 380 385 

AGT ATA TAC AAT GCT GTG TAC GCC ATG GCC CAT GCT CTC CAT GAA AAG 1257 
Ser Ile Tyr Asn Ala Val Tyr Ala Met Ala His Ala Leu His Glu Lys
390 395 400 

AAT CTG AAA CAA ATT GAT AAT CAG GAA ATC AGC TAT GGC AAA GGA GCA 1305
Asn Leu Lys Gln Ile Asp Asn Gln Glu Ile Ser Tyr Gly Lys Gly Ala
405 410 415 420 

AGT ACT CAC TGC TTG AAG TTA CAC TCA TTT TTG AGA ACG ATC CAC TTC 1353 
Ser Thr His Cys Leu Lys Leu His Ser Phe Leu Arg Thr Ile His Phe
425 430 435 



ACC AAT CCT TTT GGG GAG AGA GTG ATT ATG AAA GAG AGA GTA AGA GTG 1401 
Thr Asn Pro Phe Gly Glu Arg Val Ile Met Lys Glu Arg Val Arg Val
440 445 450 

CAG GAA GAC TAT GAC ATT GTT CAC CTG CAG AAC TGC TCA CAA CAC CTT 1449 
Gln Glu Asp Tyr Asp Ile Val His Leu Gln Asn Cys Ser Gln His Leu
455 460 465 



AGG ATT AAG GTG AAG ATA GGG CAG TTC AGC CCA TAT TTT CCA CAT GGT 1497 
Arg Ile Lys Val Lys Ile Gly Gln Phe Ser Pro Tyr Phe Pro His Gly
470 475 480 

GGA CAA TTT CAC TTA TAT GAA GAC ATG ATT GAT TTG GCC ACA GGA AGT 1545 
Gly Gln Phe His Leu Tyr Glu Asp Met Ile Asp Leu Ala Thr Gly Ser
485 490 495 500 

AGA AAG ATG CCT TTA TCT ATG TGT AGT GCA GAT TGT CGT CCT GGA TAC 1593 
Arg Lys Met Pro Leu Ser Met Cys Ser Ala Asp Cys Arg Pro Gly Tyr 
505 510 515 

AGA AAA TTC TGG AAG GAG GGA ATG GCA GCC TGC TGT TTT GTT TGC AGT 1641 
Arg Lys Phe Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser 
520 525 530 

CCC TGT CCA GAC AAT GAA ATT TCT AAT GAA ACA ACT GTG GTA CTT TGG 1689
Pro Cys Pro Asp Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Trp
535 540 545

GTC TTT GTG AAG CAC CAT GAC ACT CCT ATT GTG AAG GCC AAT AAC AGA 1737 
Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg
550 555 560 

ATC CTC AGC TAC ATA TTA ATC ATG TCA CTC ATG TTC TGC TTT CTG TGC 1785 
Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe Cys Phe Leu Cys
565 570 575 580 

TCC TTT TTC TTC ATT GGC CAT CCT AAC AGA GGT ACC TGT ATC TTA CAG 1833 
Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr Cys Ile Leu Gln
585 590 595 

CAA ATC ACA TTT GGA ATT GTA TTC ACT GTG GCT GTT TCC ACA GTT CTG 1881 
Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu
600 605 610 

GCC AAA ACA ATC ACT GTG CTT CTG GCT TTT CAA GTC ACA GAC ACA GGA 1929 
Ala Lys Thr Ile Thr Val Leu Leu Ala Phe Gln Val Thr Asp Thr Gly
615 620 625 

AGA AAG TTA AGA AAC TTC CTG GTA TCG GGG ACA CCC AAC TAC ATT ATT 1977 
Arg Lys Leu Arg Asn Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile
630 635 640 

CCC ATA TGT TCC CTG TTG CAA TGC ACT CTG TGT GCA ATT TGG CTA GCA 2025 
Pro Ile Cys Ser Leu Leu Gln Cys Thr Leu Cys Ala Ile Trp Leu Ala
645 650 655 660 

GTT TCT CCA CCA TTT GTT GAT ATC GAT GAA CAT TCT GAG CAT GGT CAC 2073 
Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His
665 670 675 

ATC ATA ATT GTG TGC AAC AAG GGA TCT GTT ATG GCA TTC TAC TGT GTC 2121 
Ile Ile Ile Val Cys Asn Lys Gly Ser Val Met Ala Phe Tyr Cys Val
680 685 690 

CTG GGA TAT TTG GCC TTC CTG GCC CTT GGA AGT TTC ACG ATG GCT TTC 2169 
Leu Gly Tyr Leu Ala Phe Leu Ala Leu Gly Ser Phe Thr Met Ala Phe 
695 700 705 

TTG GCA AAG AAT CTG CCT GAC ACA TTC AAT GAA GCC AAG TTC TTG ACC 2217 
Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr 
710 715 720 

TTC AGC ATG CTA GTG TTC TGC AGT GTC TGG ATC ACG TTC CTT CCT GTC 2265 
Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val
725 730 735 740 

TAC CAT AGC ACC AAG GGC AGA GTC ATG GTT GCT GTT GAA ATT TTC TCC 2313 
Tyr His Ser Thr Lys Gly Arg Val Met Val Ala Val Glu Ile Phe Ser
745 750 755 

ATT TTG ACA TCC AGT GCA GGG ATG CTT GGA TGC GTC TTT GCA CCC AAA 2361 
Ile Leu Thr Ser Ser Ala Gly Met Leu Gly Cys Val Phe Ala Pro Lys
760 765 770 

ATT TAC ATC ATT TTA ATG AAA CCA GAG AGA ATT CTA TCC AAA AGA CAG 2409 
Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Ile Leu Ser Lys Arg Gln
775 780 785 

GAG AAA TCA CGT TTC TAAACAGATA TTTTAGAAAT TCTGTCAAAT GTACAGTTGT T 2465 
Glu Lys Ser Arg Phe 
790 
ATATACCCAC CAAATATTTG GTTACAGTGC ATAAATCTAG TTTTAGAACT CTCACTAGTT 2525

CCTCTAATGA TATCTAGAAA TATTGTATCT ACCAATCTTA CATTCATTAT CCATAAAATC 2585 

CTGCACTCAT TCACTTGTTT GTTCTACTCT GTGAGAAATA TAATTCCCAA TGTAGTATTA 2645 

AATTTTTTCT AAAAATTTTG CTTTAATTGA CATTTTTTCC CTTATAACTT CAAGTACATT 2705 

TGATAAGGCA TTTGAATCTA TAACCTTTTA TACAATAAGA TCCAGGACAG ACAGGATTAC 2765 

ACATAGAAAC CGTCTATCGA ATCAAACAAT CAATCAGACT AAAAAACAAA GAATCAACAA 2825 

AGATAACATC AGAATACATT ATCTGATTTC CAGTAGAAGC ACATATGTGA CAGAATACTG 2885

TCTGTTTTTA TAGTTCCTCT TCAAGCTATT GTATTGGTCA GCAGTCTAAG GTAGAAGTTT 2945 

TTTTGTCACA AACACAAAAA TATTGTATCC AACAATGGAC AGAATCCAGT GAGCACCCTG 3005

TTCAAATTTG GAGATAGTTG GAATATCATG AAAAAGAGGG TGACCCATAA GAATACCAGC 3065

ATTCTCAACT AACCTGGACA ACCACGAATT TGAGCTGCTG ACCAGGCAGC ATACATAAGC 3125 

TGATATGAGG CTCCCAGCAC AGATGCAACA TAGGGCTGCC TGGTCTGGCC TCAGTGGAAG 3185

AAGACACATT TAAACCACAA GAGACAGGAG TCACAAGGGA TTGGGAAGGT GTGATGGTTT 3245 

GCATATGCTT GGCTCAGGAA GTGGCACTAT TAGAAGGTGT AGACTTGATG GAGGAATTTG 3305 

TCACTGTAGG GGTGGGCTTG GAGATCCACC TCATAGCTGC CTGGGGATGC TCAGTCTGTT 3365

CCTGGCTTCC TTCAGGTGAA GATATAGAAC TCAGATCCTC CTTCACCAAG CCTGCCTGGA 3425 

TGCTGTGATG CTGCCATGCT CCGACCTTGA TGATAATGGA CTGAACCTCT GAACATGTAA 3485

GCTGGCTCCA ATTAAAGGTT GTCCTTTATA AAACTTCCAT TGATCACAGT GTCTGTACAT 3545 

AGCAATAAGA CCCAAACTAA GACAGAAGGT GTGTGGATTG GGGAAGTGGG GATTTCCTCT 3605 

TGGAGGTGGG GAAGTAGTCA AAGATTAAAT TGGGAAGGGG ATAATGAGTA CACCGTAAAA 3665 

AGTATTAAAG AATAAAATAC TAAAAAATTA ATTAAATAGG ATTGTGAATA TATTAACATG 3725 

CTATTATATT ATAGTTCTGG AAGGGATAGG TAAAACTCCT GATGGTGGTT TGTACCTAAT 3785 

TTTTCTTAGA GCTTGCCCTT TGTATTCAGT TGTGATTGAA ATCCTGGGCT CACAAAATTC 3845 

TAGTACTATG GATATGGAGG CAGATACTTT GATTACGCTG CTTCCTAGAA ATAAATTTTC 3905

CAAAAACCAA AAAAAAAAAA AAAAAAAAAA AAA 3938 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 793 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 



Met Ile Val Phe Phe Leu Leu Asn Ile Pro Leu Leu Met Ala Asn Ser
1 5 10 15

Val Asp Pro Arg Cys Phe Trp Lys Ile Asn Leu Asn Glu Val Lys Asp
20 25 30

Ile Asp Leu Asp Thr Ser Cys Tyr Phe Ile Leu Glu Ala Val Gln Leu
35 40 45

Pro Met Glu Lys Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Lys Thr
50 55 60

Thr Lys Tyr Asn Arg Tyr Ala Leu Ala Leu Ala Phe Thr Met Asp Glu
65 70 75 80

Ile Asn Arg Asn Pro His Ile Leu Pro Asn Met Ser Leu Ile Ile Lys
85 90 95

His Thr Leu Gly His Cys Asp Gly Asn Ile Pro Leu Arg Leu Leu Asn
100 105 110

Gln Ile Phe Tyr Met Pro Phe Pro Asn Tyr Gly Cys Asn Glu Glu Thr
115 120 125

Met Cys Ser Phe Met Leu Met Gly Pro Asn Leu Trp Pro Ser Val Asp
130 135 140

Phe Phe Ile His Leu Asn Ile Leu Phe Pro His Phe Leu Gln Ile Ser
145 150 155 160

Phe Gly Pro Phe His Ser Ile Phe Ser Asp Asn Glu Gln Phe Pro Tyr
165 170 175

Ile Tyr Gln Met Thr Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val
180 185 190

Ser Phe Ile Leu Tyr Phe Asn Trp Asn Trp Val Gly Leu Val Leu Ser
195 200 205

Asp Asn Asp Glu Gly Asn Gln Phe Leu Thr Glu Leu Lys Lys Glu Thr
210 215 220

His Asn Thr Glu Ile Cys Phe Ala Phe Val Asn Met Met Ala Ile Asn
225 230 235 240

Glu Asn Ser Ser Met Lys Lys Thr Asp Met Tyr Tyr Asn Gln Ile Val
245 250 255

Met Ser Thr Ala Asn Val Ile Ile Ile Tyr Gly Glu Arg Pro Ser Ile
260 265 270

Ile Glu Leu Cys Phe Arg Thr Trp Thr Ser Pro Val Ile Gln Arg Ile
275 280 285

Trp Val Thr Lys Ser Glu Leu Tyr Phe Pro Thr Ser Lys Arg Asp Leu
290 295 300

Ser His Gly Thr Phe Tyr Gly Thr Leu Ala Phe Gln Gln His His Asp
305 310 315 320

Val Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu Lys
325 330 335

Ser Met Asp Leu Tyr Leu Leu Lys Pro Glu Trp Gly Phe Phe Glu Tyr
340 345 350

Glu Thr Ser Ala Ser Tyr Cys Lys Ile Leu Met Ser Asn Ser Ser Asn
355 360 365

Val Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Ile Ala Phe Asn
370 375 380

Asp Asn Ser His Ser Ile Tyr Asn Ala Val Tyr Ala Met Ala His Ala
385 390 395 400

Leu His Glu Lys Asn Leu Lys Gln Ile Asp Asn Gln Glu Ile Ser Tyr
405 410 415

Gly Lys Gly Ala Ser Thr His Cys Leu Lys Leu His Ser Phe Leu Arg
420 425 430

Thr Ile His Phe Thr Asn Pro Phe Gly Glu Arg Val Ile Met Lys Glu
435 440 445

Arg Val Arg Val Gln Glu Asp Tyr Asp Ile Val His Leu Gln Asn Cys
450 455 460

Ser Gln His Leu Arg Ile Lys Val Lys Ile Gly Gln Phe Ser Pro Tyr
465 470 475 480

Phe Pro His Gly Gly Gln Phe His Leu Tyr Glu Asp Met Ile Asp Leu
485 490 495

Ala Thr Gly Ser Arg Lys Met Pro Leu Ser Met Cys Ser Ala Asp Cys
500 505 510

Arg Pro Gly Tyr Arg Lys Phe Trp Lys Glu Gly Met Ala Ala Cys Cys
515 520 525

Phe Val Cys Ser Pro Cys Pro Asp Asn Glu Ile Ser Asn Glu Thr Thr
530 535 540

Val Val Leu Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys
545 550 555 560

Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe
565 570 575

Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr
580 585 590

Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val
595 600 605

Ser Thr Val Leu Ala Lys Thr Ile Thr Val Leu Leu Ala Phe Gln Val
610 615 620

Thr Asp Thr Gly Arg Lys Leu Arg Asn Phe Leu Val Ser Gly Thr Pro
625 630 635 640

Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Thr Leu Cys Ala
645 650 655

Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser
660 665 670

Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Met Ala
675 680 685

Phe Tyr Cys Val Leu Gly Tyr Leu Ala Phe Leu Ala Leu Gly Ser Phe
690 695 700

Thr Met Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala
705 710 715 720

Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr
725 730 735

Phe Leu Pro Val Tyr His Ser Thr Lys Gly Arg Val Met Val Ala Val
740 745 750

Glu Ile Phe Ser Ile Leu Thr Ser Ser Ala Gly Met Leu Gly Cys Val
755 760 765

Phe Ala Pro Lys Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Ile Leu
770 775 780

Ser Lys Arg Gln Glu Lys Ser Arg Phe
785 790

















(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 59...2452

(D) OTHER INFORMATION: GoVN13C 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

CGGCACGAGC ACAGTCCACT CTGTCAGGGT TTAAGGCAGG AAAAACATGC TCATTTTG AT 60 

Met 
1 

GGT AAT ATT CTT CCT TCT CAA CAT TCC ATT TCT CCT GGC AAA TTT CAT 108 
Val Ile Phe Phe Leu Leu Asn Ile Pro Phe Leu Leu Ala Asn Phe Met
5 10 15 

GGA TCC CAG ATG CTT TTG GAA AAT AAA TTT GAA TGA AAT CAA GGA TGA 156 
Asp Pro Arg Cys Phe Trp Lys Ile Asn Leu Asn Glu Ile Lys Asp Glu
20 25 30 

AGT CCT TGG GAT GAC TTG TTC CTT CAT CCT TGA AAC AGT TCA GAA GAC 204 
Val Leu Gly Met Thr Cys Ser Phe Ile Leu Glu Thr Val Gln Lys Thr
35 40 45 

TAT GGA CAA AGA TTA TTT CAA CCA GAC TCT GAA TGT CCT AAA TAC AAC 252 
Met Asp Lys Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Asn Thr Thr
50 55 60 65 

TAC AAA CCA CAA ATA TGC CTT GGC ATT GGC CTT TAC AGT GGA TGA AAT 300 
Thr Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Thr Val Asp Glu Ile
70 75 80 

CAA CAG GAA TCC TGA TCT TTT ACC AAA TAT GTC TCT GAT TAT AAA ATA 348

Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Lys Tyr
85 90 95 

CAA TTT GGG TCA TTG TGA TGG AAA AAC TGT AAC AAC TCT ATC CGA TTT 396 
Asn Leu Gly His Cys Asp Gly Lys Thr Val Thr Thr Leu Ser Asp Leu 
100 105 110

ATT TAA TCC AAA TAA TCA TCT CCA TTT CCC CAA TTA TTT ATG TAA TGA 444 
Phe Asn Pro Asn Asn His Leu His Phe Pro Asn Tyr Leu Cys Asn Glu 
115 120 125 



AGG GAT TAT GTG TTT GGT TCT GCT TAC AGG ACC ACA TTG GAG AGC ATC 492
Gly Ile Met Cys Leu Val Leu Leu Thr Gly Pro His Trp Arg Ala Ser
130 135 140 145

TTT ATA TCT CTG GAT ATC CGT GTA TGT CTA CCT GTC TCC ACA TTT CCT 540
Leu Tyr Leu Trp Ile Ser Val Tyr Val Tyr Leu Ser Pro His Phe Leu
150 155 160

TCA GCT TTC CTA TGG ACC TTT CTA CTC CAT CTT CAG TGA TAA TGA ACA 588 
Gln Leu Ser Tyr Gly Pro Phe Tyr Ser Ile Phe Ser Asp Asn Glu Gln
165 170 175 

ATA TCC TTA TCT CTA TCA GAT GGG CCC AAA GGA CTC ATC ACT AGC ATT 636 
Tyr Pro Tyr Leu Tyr Gln Met Gly Pro Lys Asp Ser Ser Leu Ala Leu
180 185 190 

GGC AAT GGT CTC CTT CAT AAT TTA CTT CAA GTG GAA CTG GGT TGG GCT 684 
Ala Met Val Ser Phe Ile Ile Tyr Phe Lys Trp Asn Trp Val Gly Leu
195 200 205 

ATT TAT CTC AGA TGA TGA TCA AGG CAA TCA ATT TCT CTC AGA GTT GAA 732 
Phe Ile Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys
210 215 220 225

AAA AGA GAG CCA AAC CAA GGA TAT TTG CTT TGC CTT TGT GAA CAT GAT 780
Lys Glu Ser Gln Thr Lys Asp Ile Cys Phe Ala Phe Val Asn Met Ile
230 235 240

ATC AGT CAG TGA TGT TTC ATA CTA TCA TAA AAC TGA AAT GTA CTA CAA 828 
Ser Val Ser Asp Val Ser Tyr Tyr His Lys Thr Glu Met Tyr Tyr Asn 
245 250 255 

CCA AAT TGT GAT GTC ATC CAC AAA GGT TAT TAT CAT TTA TGG GGA AAC 876 
Gln Ile Val Met Ser Ser Thr Lys Val Ile Ile Ile Tyr Gly Glu Thr
260 265 270 

AAA CAG TAT TAT TGA ATT GAG CTT CAG AAT GTG GTC ATC TCC AGT TAA 924 
Asn Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Ser Ser Pro Val Lys
275 280 285 

ACA GAG AAT ATG GGT CAC CAC AAA ACA ATT TGA TTG CCC TAC CAG TAA 972 
Gln Arg Ile Trp Val Thr Thr Lys Gln Phe Asp Cys Pro Thr Ser Lys
290 295 300 305

GAG AGA CTT AAC TCA TGG CAC ATT CTA TGG GAC CCT TAC ATT TCT ACA 1020
Arg Asp Leu Thr His Gly Thr Phe Tyr Gly Thr Leu Thr Phe Leu His
310 315 320

CCA CTA TGG TGA GAT TTC TGG CTT TAA AAA TTT TGT ACA GAC ACG GTA 1068 
His Tyr Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Arg Tyr
325 330 335 

CAA TCT CAG AAG CAC AGA TTT ATA TCT AGT AAT GCC AGA GTG GAA ATA 1116 
Asn Leu Arg Ser Thr Asp Leu Tyr Leu Val Met Pro Glu Trp Lys Tyr 
340 345 350 

TTT TAA CTA TGA AGC CTC AGC ATC TAA CTG TAA AAT ACT GAG AAA CTA 1164 
Phe Asn Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Arg Asn Tyr
355 360 365 

TTT ATC CAA TAT CTC ACT GGA ATG GCT AAT GGA ACA GAA ATT TGA CAT 1212 
Leu Ser Asn Ile Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Met
370 375 380 385

GTC ATT TAG TGA TTA TAG TCA CAA CAT ATA CAA TGC TGT ATA TGC CAT 1260
Ser Phe Ser Asp Tyr Ser His Asn Ile Tyr Asn Ala Val Tyr Ala Ile
390 395 400

TGC TCA TGC ACT CCA TGA GAA GAA TCT GCA AGA AGT TGA AAA TCA GGC 1308 
Ala His Ala Leu His Glu Lys Asn Leu Gln Glu Val Glu Asn Gln Ala
405 410 415 

AAT AAA CAA TGC GAA AGG AGA AAA TAC TCA CTG CTT GAA GCT AAA CTC 1356 
Ile Asn Asn Ala Lys Gly Glu Asn Thr His Cys Leu Lys Leu Asn Ser
420 425 430 

ATT TCT GAG AAA GAC CCA CTT CAC TAA TTC TCT TGG GAA CAG AGT AAT 1404 
Phe Leu Arg Lys Thr His Phe Thr Asn Ser Leu Gly Asn Arg Val Ile
435 440 445 

TAT GAA ACA GAG AGA AGT AGT GCA TGG AGA CTA TAA TAT TGT TCA CAT 1452 
Met Lys Gln Arg Glu Val Val His Gly Asp Tyr Asn Ile Val His Met
450 455 460 465

GTG GAA TTT CTC ACA ACG CCT TGG GAT TAA GGT GAA GAT AGG ACA ATT 1500
Trp Asn Phe Ser Gln Arg Leu Gly Ile Lys Val Lys Ile Gly Gln Phe
470 475 480

CAG CCC ACA TTT TCC ACA GGG TCA ACA GTT ACA CTT ATA TGT AGA CAT 1548
Ser Pro His Phe Pro Gln Gly Gln Gln Leu His Leu Tyr Val Asp Met
485 490 495 

GAC TGA GTT GGC TAC AGG AAG TAG AAA GAT GCC ATC CTC AGT GTG CAG 1596
Thr Glu Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Ser 
500 505 510 

TGC AGA TTG CCA TCC TGG ATT CAG AAG AAT CTG GAA GGA GGA AAT GGC 1644 
Ala Asp Cys His Pro Gly Phe Arg Arg Ile Trp Lys Glu Glu Met Ala
515 520 525 

AGC CTG CTG TTT TGT TTG CAA CCC CTG CCC TGA AAA TGA AAT TTC TAA 1692 
Ala Cys Cys Phe Val Cys Asn Pro Cys Pro Glu Asn Glu Ile Ser Asn
530 535 540 545

TGA GAC GAT GGT GGT ATT TTG GGT CTT CGT GAA GCA CCA TGA CAC TCC 1740
Glu Thr Met Val Val Phe Trp Val Phe Val Lys His His Asp Thr Pro
550 555 560

TAT TGT GAA GGC CAA TAA CAG AAT CCT CAG CTA CCT ATT AAT CGT GTC 1788 
Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser
565 570 575 

ACT CAT GTT CTG TTT TCT GTG CTC CTT TTT CTT CAT TGG CTA TCC TAA 1836 
Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn
580 585 590 

CAG AGC AAC CTG TAT CTT ACA GCA AAT CAC ATT TGG AAT CTT CTT TAC 1884 
Arg Ala Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr
595 600 605 

TGT GGC TAT TTC CAC AGT TCT GGC CAA AAC AAT CAC TGT GGT TCT GGC 1932 
Val Ala Ile Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala
610 615 620 625

TTT CAA AGT CAC AGA CCC AGG AAG ACA ATT AAG AAT CTT TTT GGT ATC 1980
Phe Lys Val Thr Asp Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser
630 635 640

GGG GAC ACC CAA CTA CAT TAT TCC CAT ATG TTC CCT ATT GCA ATG TAT 2028 
Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile
645 650 655 
TCT GTG TGC AAT CTG GCT AGC AGT TTC TCC TCC CTT TGT TGA TAT TGA 2076
Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp
660 665 670 

TGA ACA CTC TGA GCA TGG CCA CAT CAT CAT TGT GTG CAA CAA GGG CTC 2124 
Glu His Ser Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser
675 680 685 

CAT TAC TGC ATT CTA CTG TGT CCT GGG ATA CTT GGC CTG CCT GGC CTT 2172 
Ile Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe
690 695 700 705

TGG AAG CTT CAC TAT AGC TTT CTT GGC AAA GAA CCT GCC TGA CAC ATT 2220
Gly Ser Phe Thr Ile Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe
710 715 720

CAA CGA AGC CAA GTT CTT GAC CTT CAG CAT GCT AGT GTT CTG CGC TGT 2268 
Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val 
725 730 735 

CTG GGT CAC CTT CCT CCC TGT CTA CCA TAG CAC CAA GGG CAA GGT CAT 2316 
Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val Met 
740 745 750 

GGT TGC TGT GGA GAT CTT CTC CAT CTT GGC ATC TAG TGC AGG GAT GCT 2364 
Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu
755 760 765 

GGG ATG CAT CTT TGC ACC CAA AGT TTA CAT CAT TTT AAT GAG ACC AGA 2412 
Gly Cys Ile Phe Ala Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp
770 775 780 785

CAG AAA TTC GAT CCA CAA AAT CAG GGA GAA ATC ATA TTT C TGAAAAGGTA 2462
Arg Asn Ser Ile His Lys Ile Arg Glu Lys Ser Tyr Phe
790 795

TTTCAGGAAT TCTGTCAAAT GTAAAGTTGA TACATACACC CCAAATATTT AGTTACAGAG 2522 

CATATATCTA GTTTTAGAAT CACTCTCACT GGTTCCTCTA GTTAAGCATA GAAGTACCAT 2582 

ATGTACTGAT CTTGCATATG TTGTCTATAA AATCTTACAA TCATTCATTT GCTTAGTATC 2642 

TTCTGGAAGA AGTAAAATTT TCAAATAACT AGTACAATTT TATTCATTAT TTTGCTTTCA 2702 

TGAGGATTTC CCCCTGGTAA CTTCAAATAA ATTTTATAAG TCAGTTGAAT ATATAACCTT 2762 

ACATAGAAAG TGAGTTCTAG GACAGACAGG GATTATACAT AGAAACAAAC TAACTAAAAA 2822 

TCAACAAAGA TGAAATCAGA ACACATTTTC TTATTTCCAG TAGGAACACA TACTTGACAG 2882 

AATACTGTCT TTTTTTCAGC TGCTCTTTAA GATATTGGCC AATAGTCTAA GCTGAAAATG 2942 

TTCTTTATCT ACTCTCAAAT ACAAAAATAT TATATCCAAC AATGGACAGA ATCTGAGAAC 3002 

TCCTGTGGTT GAGTTAGGGA ATAGTTGGAA GATACTGAGA AGGAGGTGAC CCATAGGAAT 3062 

ACAAAGCAGT CTCAACTAAC CTGGACAACC AAGGTCCCTC AGACACTGAG CCACTAACAA 3122 

GTCAGCCTAC TCCAGCTGTT ATGAGGCCCC CAAAACATAT GCAACATAGG ATTGCCTGGT 3182 

CCAGCCTCAG CAAGAGAATA CACACCTAAC CACAGAGAGA CTTCCCCAAG GGATTGGGGA 3242 

GGTCTGGGGT TTGGAGAGTT GCGGATTGTC CCTTGATGAT TGGAAGGAGG TATTGGATGA 3302 

GAATGAATCA GGGGGAAGAC TAGGAAGGGG ATAATGATGG AACTGTAAAA AAAAAAA 3359 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 798 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
Met


Val 


He 


Phe 


Phe 


Leu 


Leu 


Asn 


l 








5 








Met 


Asp 


Pro 


Arg 


Cys 


Phe 


Trp 


Lvs 








20 










Glu 


Val 


Leu 


Gly 


Met 


Thr 


Cvs 

2 *■* 


Ser 






35 










40 


Thr 


Met 


Asp 


Lys 


Asp 


Tyr 


Phe 


Asn 




50 










55 




Thr 


Thr 


Asn 


His 


LVS 


Tvr 


Ala 


Leu 


65 










70 






lie 


Asn 


Arcr 


Asn 


Pro 


Asp 


Leu 


Leu 










85 








Tyr Asn 


Leu 


Glv 


His 


^y D 


Asp 


Glv 








100 










Leu 


Phe 


Asn 


Pro 


Asn 


Asn 


His 


Leu 






115 










120 


Glu Gly 


He 


Met 


Cvs 


Leu 


Val 


Leu 




130 










135 




Ser 


Leu 


Tyr 


Leu 


Trp 


He 


Ser 


Val 


145 










150 






Leu 


Gin 


Leu 


Ser 


Tvr 


Glv 


Pro 


Phe 










165 








Gin 


Tyr 


Pro 


Tvr 


Leu 


Tvr 


Gin 


Met 








180 










Leu 


Ala 


Met 


Val 


Ser 


Phe 


He 


He 






195 










200 


Leu 


Phe 


He 


Ser 


Asp 


Asp 


Asp 


Gin 




210 










215 




Lys 


Lys 


Glu 


Ser 


Gin 


Thr 


Lvs 


Asp 


225 










230 






He 


Ser 


Val 


Ser 


Asp 


Val 


Ser 


Tvr 










245 








Asn 


Gin 


He 


Val 


Met 


Ser 


Ser 


Thr 








260 










Thr 


Asn 


Ser 


He 


He 


Glu 


Leu 


Ser 






275 










280 


Lys 


Gin 


Arg 


He 


Trp 


Val 


Thr 


Thr 




290 










295 




Lys 


Arg 


Asp 


Leu 


Thr 


His 


Glv 


Thr 


305 










310 






His 


His 


Tvr 


Gly 


Glu 


He 


Ser 


Glv 










325 








Tyr Asn 


Leu 


Arcr 


Ser 


Thr 


Asp 


Leu 








340 










Tyr 


Phe 


Asn 


Tvr 


Glu 


Ala 


Ser 


Ala 






355 










360 


Tyr 


Leu 


Ser 


Asn 


He 


Ser 


Leu 


Glu 




370 










375 




Met 


Ser 


Phe 


Ser 


Asp 


Tvr 


Ser 


His 


385 










390 






He 


Ala 


His 


Ala 


Leu 


His 


Glu 


Lvs 










405 






Ala 


He 


Asn 


Asn 


Ala 


Lys 


Gly 


Glu 








420 










Ser 


Phe 


Leu 


Arg 


Lys 


Thr 


His 


Phe 






435 










440 


He 


Met 


Lys 


Gin 


Arg 


Glu 


Val 


Val 




450 










455 




Met 


Trp 


Asn 


Phe 


Ser 


Gin 


Arg 


Leu 


465 










470 




Phe 


Ser 


Pro 


His 


Phe 


Pro 


Gin 


Gly 










485 






Met 


Thr 


Glu 


Leu 


Ala 


Thr 


Gly 


Ser 








500 








Ser 


Ala 


Asp 


Cys 


His 


Pro 


Gly 


Phe
Ile


Pro 


Phe 


Leu 


Leu 


Ala 


Asn 


Phe 




10 










15 




He 


Asn 


Leu 


Asn 


Glu 


He 


Lys 


Asp 


25 










30 




Phe 


He 


Leu 


Glu 


Thr 


Val 


Gin 


Lys 










45 






Gin 


Thr 


Leu 


Asn 


Val 


Leu 


Asn 


Thr 








60 










Ala 


Leu 


Ala 


Phe 


Thr 


Val 


Asp Glu 






75 










80 


Pro 


Asn 


Met 


Ser 


Leu 


He 


He 


Lys 




90 










95 




Lys 


Thr 


Val 


Thr 


Thr 


Leu 


Ser 


Asp 


105 










Tin 






His 


Phe 


Pro 


Asn 


i-yr 


JUG U 


Cys 


Asn 










IOC 








Leu 


Thr 


Glv 


ri w 




Trp 


Arg 


Ala 








140 










Tyr 


Val 


Tvr 


Leu 


Ser 


Pro 


His 


Phe 






155 










i ^ *■» 


Tvr 


Ser 


He 


Phe 


Ser 


As 


Asn 


Glu 




170 










175 




Glv 


Pro 


Lys 


Asp 


Ser 


ser 


Leu 


Ala 


185 










190 






Tyr 


Phe 


Lvs 

2 w * 


Tro 


Asn 




Val 


Gly 










one 








Gly 


Asn 


Gin 


Phe 


Leu 


Ser 


Glu 


Leu 








220 










He 


Cys 


Phe 


Ala 


Phe 


Val 


Asn 


Met 






235 










240 


Tvr 


His 


Lys 


Thr 


Glu 


Met 


Tyr 


Tyr 




250 










255 




Lys 


Val 


He 


He 


He 


Tvr 


Gly Glu 


265 










270 






Phe 


Ara 


Met 


Trp 


Ser 


Ser 


Pro 


Val 










285 








Lys 


Gin 


Phe 


Asp 


Cys 


Pro 


Thr 


Ser 








300 










Phe 


Tvr 

j 


Glv 


Thr 


Leu 


Thr 


Phe 


Leu 






315 










320 


Phe 


Lys 


Asn 


Phe 


Val 


Gin 


Thr Arg 




330 










335 




Tvr 


Leu 


Val 


Met 


Pro 


Glu 


Trp 


Lys 


345 














Ser 


Asn 


Cvs 


Lvs 


He 


Leu 


Arg Asn 










365 








Trp 


Leu 


Met 


Glu 


Gin 


Lvs 


Phe 


Asp 








380 










Asn 


He 


Tvr 


Asn 


Ala 


Val 


Tyr 


Ala 






395 










400 


Asn 


Leu 


Gin 


Glu 


Val 


Glu 


Asn 


Gin 




410 










415 




Asn 


Thr 


His 


Cys 


Leu 


Lys 

2 w 


Leu 


Asn 


425 










430 






Thr 


Asn 


Ser 


Leu 


Gly 


Asn 


Arg Val 










445 








His 


Gly 


Asp 


Tyr 


Asn 


He 


Val 


His 








460 










Gly 


He 


Lys 


Val 


Lys 


He 


Gly Gin 






475 










480 


Gin 


Gin 


Leu 


His 


Leu 


Tyr 


Val 


Asp 




490 










495 




Arg 


Lys 


Met 


Pro 


Ser 


Ser 


Val 


Cys 


505 










510 






Arg 


Arg 


He 


Trp 


Lys 


Glu 


Glu 


Met
515 520 525



Ala Ala 


Cys 


Cys 


Phe 


Val 


Cys 


Asn 


Pro 


Cys 

- 


Pro 


Glu 


Asn 


Glu 


He 


Ser 


530 










535 








540 










Asn Glu 


Thr 


Met 


Val 


Val 


Phe 


Trp 


Val 


Phe 


Val 


Lys 


His 


His 


Asp 


Thr 


545 








550 










555 








560 


Pro He 


Val 


Lys 


Ala 
565 


Asn 


Asn 


Arg 


He 


Leu 
570 


Ser 


Tyr 


Leu 


Leu 


He 
575 


Val 


Ser Leu 


Met 


Phe 


Cys 


Phe 


Leu 


Cys 


Ser 


Phe 


Phe 


Phe 


He 


Gly 


Tyr 


Pro 






580 










585 










590 




Asn Arg 


Ala 
595 


Thr 


Cys 


He 


Leu 


Gin 
600 


Gin 


He 


Thr 


Phe 


Gly 
605 


He 


Phe 


Phe 


Thr Val 


Ala 


He 


Ser 


Thr 


Val 


Leu 


Ala 


Lys 


Thr 


He 


Thr 


Val 


Val 


Leu 


610 










615 








620 










Ala Phe 


Lys 


Val 


Thr 


Asp 


Pro 


Gly 


Arg 


Gin 


Leu 


Arg 


He 


Phe 


Leu 


Val 


625 








630 










635 








640 


Ser Gly 


Thr 


Pro 


Asn 


Tyr 


He 


He 


Pro 


He 


Cys 


Ser 


Leu 


Leu 


Gin 


Cvs 








645 










650 










655 


He Leu 


Cys 


Ala 


He 


Trp 


Leu 


Ala 


Val 


Ser 


Pro 


Pro 


Phe 


Val 


Asp 


He 






660 










665 










670 




Asp Glu 


His 


Ser 


Glu 


His 


Gly 


His 


He 


He 


He 


Val 


_ J cs 


Asn 


Lys 


Gly 




675 










680 










685 




Ser He 


Thr 


Ala 


Phe 


Tyr 


Cys 


Val 


Leu 


Gly Tyr Leu 


Ala 


Cys 


Leu 


Ala 


690 










695 










700 








Phe Gly 


Ser 


Phe 


Thr 


He 


Ala 


Phe 


Leu 


Ala 


Lys 


Asn 


Leu 


Pro 


Asp 


Thr 


705 








710 










715 








720 


Phe Asn 


Glu 


Ala 


Lys 
725 


Phe 


Leu 


Thr 


Phe 


Ser 
730 


Met 


Leu 


Val 


Phe 


Cys 
735 


Ala 


Val Trp 


Val 


Thr 


Phe 


Leu 


Pro 


Val 


Tyr 


His 


Ser 


Thr 


Lys 


Gly 


Lys 


Val 






740 










745 








750 




Met Val 


Ala 


Val 


Glu 


He 


Phe 


Ser 


He 


Leu 


Ala 


Ser 


Ser 


Ala 


Gly Met 




755 










760 










765 








Leu Gly 


Cys 


He 


Phe 


Ala 


Pro 


Lys 


Val 


Tyr 


He 


He 


Leu 


Met 


Arg 


Pro 


770 










775 










780 








Asp Arg 


Asn 


Ser 


He 


His 


Lys 


He 


Arg 


Glu 


Lys 


Ser 


Tyr 


Phe 







785 790 795 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3012 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 3...2087

(D) OTHER INFORMATION: GOVN13B 
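
The coding-sequence location above fixes the reading frame of SEQ ID NO: 49. The following is a minimal illustrative sketch, not part of the filed listing, of how a 1-based inclusive location maps to codons under the standard genetic code. The function name `translate_cds` and the one-letter output are assumptions made for illustration only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG x TCAG x TCAG codon order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(seq, start, end):
    """Translate the 1-based, inclusive region start..end of seq.

    Spaces are ignored; unknown codons (for example those containing N)
    are reported as 'X'. Hypothetical helper, assumed for illustration.
    """
    cds = seq.replace(" ", "").upper()[start - 1:end]
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


# First printed row of SEQ ID NO: 49 (nucleotides 1-47); the frame starts at base 3.
first_row = "AT GTC TAC CTG TCT CCA CAT TTC CTT CAG CTT TCC TAT GGA CCT TTC"
print(translate_cds(first_row, 3, 47))  # VYLSPHFLQLSYGPF
print((2087 - 3 + 1) // 3)              # 695 codons in the region 3...2087
```

The codon count of 695 agrees with the stated length of SEQ ID NO: 50.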



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AT GTC TAC CTG TCT CCA CAT TTC CTT CAG CTT TCC TAT GGA CCT TTC 47 
Val Tyr Leu Ser Pro His Phe Leu Gln Leu Ser Tyr Gly Pro Phe
1 5 10 15 

TAC TCC ATC TTC AGT GAT AAT GAA CAA TAT CCT TAT CTC TAT CAG ATG 95 
Tyr Ser Ile Phe Ser Asp Asn Glu Gln Tyr Pro Tyr Leu Tyr Gln Met
20 25 30 

GGC CCA AAG GAC TCA TCA CTA GCA TTG GCA ATG GTC TCC TTC ATA ATT 143 
Gly Pro Lys Asp Ser Ser Leu Ala Leu Ala Met Val Ser Phe Ile Ile
35 40 45 




TAC TTC AAG TGG AAC TGG GTT GGG CTA TTT ATC TCA GAT GAT GAT CAA 191 
Tyr Phe Lys Trp Asn Trp Val Gly Leu Phe Ile Ser Asp Asp Asp Gln
50 55 60 

GGC AAT CAA TTT CTC TCA GAG TTG AAA AAA GAG AGC CAA ACC AAG GAT 239
Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Gln Thr Lys Asp
65 70 75 

ATT TGC TTT GCC TTT GTG AAC ATG ATA TCA GTC AGT GAT GTT TCA TAC 287 
Ile Cys Phe Ala Phe Val Asn Met Ile Ser Val Ser Asp Val Ser Tyr
80 85 90 95 

TAT CAT AAA ACT GAA ATG TAC TAC AAC CAA ATT GTG ATG TCA TCC ACA 335 
Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr
100 105 110

AAG GTT ATT ATC ATT TAT GGG GAA ACA AAC AGT ATT ATT GAA TTG AGC 383 
Lys Val Ile Ile Ile Tyr Gly Glu Thr Asn Ser Ile Ile Glu Leu Ser
115 120 125 

TTC AGA ATG TGG TCA TCT CCA GTT AAA CAG AGA ATA TGG GTC ACC ACA 431 
Phe Arg Met Trp Ser Ser Pro Val Lys Gln Arg Ile Trp Val Thr Thr
130 135 140 

AAA CAA TTT GAT TGC CCT ACC AGT AAG AGA GAC TTA ACT CAT GGC ACA 479 
Lys Gln Phe Asp Cys Pro Thr Ser Lys Arg Asp Leu Thr His Gly Thr
145 150 155 

TTC TAT GGG ACC CTT ACA TTT CTA CAC CAC TAT GGT GAG ATT TCT GGC 527 
Phe Tyr Gly Thr Leu Thr Phe Leu His His Tyr Gly Glu Ile Ser Gly
160 165 170 175 

TTT AAA AAT TTT GTA CAG ACA CGG TAC AAT CTC AGA AGC ACA GAT TTA 575 
Phe Lys Asn Phe Val Gln Thr Arg Tyr Asn Leu Arg Ser Thr Asp Leu
180 185 190 

TAT CTA GTA ATG CCA GAG TGG AAA TAT TTT AAC TAT GAA GCC TCA GCA 623 
Tyr Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala 
195 200 205 

TCT AAC TGT AAA ATA CTG AGA AAC TAT TTA TCC AAT ATC TCA CTG GAA 671 
Ser Asn Cys Lys Ile Leu Arg Asn Tyr Leu Ser Asn Ile Ser Leu Glu
210 215 220 

TGG CTA ATG GAA CAG AAA TTT GAC ATG TCA TTT AGT GAT TAT AGT CAC 719 
Trp Leu Met Glu Gln Lys Phe Asp Met Ser Phe Ser Asp Tyr Ser His
225 230 235 

AAC ATA TAC AAT GCT GTA TAT GCC ATT GCT CAT GCA CTC CAT GAG AAA 767 
Asn Ile Tyr Asn Ala Val Tyr Ala Ile Ala His Ala Leu His Glu Lys
240 245 250 255 

GAT CTG CAA GAA TTT GAA AAT CAG GCA ATA AAC AAT GCG AAA GGA GAA 815 
Asp Leu Gln Glu Phe Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu
260 265 270 

AAT ACT CAC TGC TTG AAG CTA AAC TCA TTT CTG AGA AAG ACC CAC TTC 863 
Asn Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe 
275 280 285 

ACT AAT TCT CTT GGG AAC AGA GTA ATT ATG AAA CAG AGA GAA GTA GTG 911 
Thr Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val
290 295 300 



CAT GGA GAC TAT AAT ATT GTT CAC ATG TGG AAT TTC TCA CAA CGC CTT 959
His Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu
305 310 315

GGG ATT AAG GTG AAG ATA GGA CAA TTC AGC CCA CAT TTT CCA CAG GGT 1007 
Gly Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly
320 325 330 335 

CAA CAG TTA CAC TTA TAT GTA GAC ATG ACT GAG TTG GCT ACA GGA AGT 1055 
Gln Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu Ala Thr Gly Ser
340 345 350 

AGA AAG ATG CCA TCC TCA GTG TGC AGT GCA GAT TGC CAT CCT GGA TTC 1103 
Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe 
355 360 365 

AGA AGA ATC TGG AAG GAG GAA ATG GCA GCC TGC TGT TTT GTT TGC AAC 1151 
Arg Arg Ile Trp Lys Glu Glu Met Ala Ala Cys Cys Phe Val Cys Asn
370 375 380 

CCC TGC CCT GAA AAT GAA ATT TCT AAT GAG ACG AAT ATG GAT CAG TGT 1199 
Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys
385 390 395 

GCG AAT TGT CCA GAA TAC CAG TAT GCC AAC ACA GAA AAG AAC AAA TGC 1247
Ala Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys
400 405 410 415 

ATC CAG AAA GGT GTG ATT GTT CTA AGC TAT GAA GAC CCC TTG GGG ATG 1295
Ile Gln Lys Gly Val Ile Val Leu Ser Tyr Glu Asp Pro Leu Gly Met
420 425 430 

GCT CTT GCC TTA ATA GCA TTC TGT TTC TCT GCA TTC ACA GTG GTG GTA 1343 
Ala Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala Phe Thr Val Val Val
435 440 445 

TTT TGG GTC TTC GTG AAG CAC CAT GAC ACT CCT ATT GTG AAG GCC AAT 1391
Phe Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn
450 455 460 

AAC AGA ATC CTC AGC TAC CTA TTA ATC GTG TCA CTC ATG TTC TGT TTT 1439
Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe
465 470 475 

CTG TGC TCC TTT TTC TTC ATT GGC TAT CCT AAC AGA GCA ACC TGT ATC 1487
Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile
480 485 490 495 

TTA CAG CAA ATC ACA TTT GGA ATC TTC TTT ACT GTG GCT ATT TCC ACA 1535 
Leu Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr
500 505 510 

GTT CTG GCC AAA ACA ATC ACT GTG GTT CTG GCT TTC AAA GTC ACA GAC 1583 
Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp
515 520 525 

CCA GGA AGA CAA TTA AGA ATC TTT TTG GTA TCG GGG ACA CCC AAC TAC 1631 
Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr
530 535 540 

ATT ATT CCC ATA TGT TCC CTA TTG CAA TGT ATT CTG TGT GCA ATC TGG 1679 
Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp
545 550 555 



CTA GCA GTT TCT CCT CCC TTT GTT GAT ATT GAT GAA CAC TCT GAG CAT 1727
Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His
560 565 570 575

GGC CAC ATC ATC ATT GTG TGC AAC AAG GGC TCC ATT ACT GCA TTC TAC 1775
Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Ile Thr Ala Phe Tyr
580 585 590 

TGT GTC CTG GGA TAC TTG GCC TGC CTG GCC TTT GGA AGC TTC ACT ATA 1823 
Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Gly Ser Phe Thr Ile
595 600 605 

GCT TTC TTG GCA AAG AAC CTG CCT GAC ACA TTC AAC GAA GCC AAG TTC 1871 
Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe 
610 615 620 

TTG ACC TTC AGC ATG CTA GTG TTC TGC GCT GTC TGG GTC ACC TTC CTC 1919 
Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val Trp Val Thr Phe Leu 
625 630 635 

CCT GTC TAC CAT AGC ACC AAG GGC AAG GTC ATG GTT GCT GTG GAG ATC 1967 
Pro Val Tyr His Ser Thr Lys Gly Lys Val Met Val Ala Val Glu Ile
640 645 650 655

TTC TCC ATC TTG GCA TCT AGT GCA GGG ATG CTG GGA TGC ATC TTT GCA 2015 
Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala
660 665 670 

CCC AAA GTT TAC ATC ATT TTA ATG AGA CCA GAC AGA AAT TCG ATC CAC 2063 
Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp Arg Asn Ser Ile His
675 680 685 

AAA ATC AGG GAG AAA TCA TAT TTC TGAAAAGGTA TTTCAGGAAT TCTGTCAAAT 2117 
Lys Ile Arg Glu Lys Ser Tyr Phe
690 695 

GTAAAGTTGA TACATACACC CCAAATATTT AGTTACAGAG CATATATCTA GTTTTAGAAT 2177 

CACTCTCACT GGTTCCTCTA GTTATGCATA GAAGTACCAT ATGTACTGAT CTTGCATATG 2237 

TTGTCTATAA AATCTTACAA TCATTCATTT GCTTAGTATC TTCTGGAAGA AGTAAAATTT 2297 

TCAAATAACT AGTACAATTT TATTCATTAT TTTGCTTTCA TGAGGATTTC CCCCTGGTAA 2357 

CTTCAAATAA ATTTTATAAG TCAGTTGAAT ATATAACCTT ACATAGAAAG TGAGTTCTAG 2417 

GACAGACAGG GATTATACAT AGAAACAAAC TAACTAAAAA TCAACAAAGA TGAAATCAGA 2477

ACACATTTTC TTATTTCCAG TAGGAACACA TACTTGACAG AATACTGTCT TTTTTTCAGC 2537

TGCTCTTTAA GATATTGGCC AATAGTCTAA GCTGAAAATG TTCTTTATCT ACTCTCAAAT 2597 

ACAAAAATAT TATATCCAAC AATGGACAGA ATCTGAGAAC TCCTGTGGTT GAGTTAGGGA 2657 

ATAGTTGGAA GATACTGAGA AGGAGGGTGA CCCATAGGAA TACAAAGCAG TCTCAACTAA 2717 

CCTGGACAAC CAAGGTCCCT CAGACACTGA GCCACTAACA AGTCAGCCTA CTCCAGCTGT 2777 

TATGAGGCCC CCAAAACATA TGCAACATAG GATTGCCTGG TCCAGCCTCA GCAAGAGAAT 2837 

ACACACCTAA CCACAGAGAG ACTTCCCCAA GGGATTGGGG AGGTCTGGGG TTTGGAGAGT 2897

TGCGGATTGT CCCTTGATGA TTGGAAGGAG GTATTGGATG AGAATGAATC AGGGGGAAGA 2957 

CTAGGAAGGG GATAATGATG GAACTGTAAA AAAAATTAAA AAAAAAAAAA AAAAA 3012 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 695 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Val Tyr Leu Ser Pro His Phe Leu Gln Leu Ser Tyr Gly Pro Phe Tyr
1 5 10 15

Ser Ile Phe Ser Asp Asn Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Gly
20 25 30

Pro Lys Asp Ser Ser Leu Ala Leu Ala Met Val Ser Phe Ile Ile Tyr
35 40 45

Phe Lys Trp Asn Trp Val Gly Leu Phe Ile Ser Asp Asp Asp Gln Gly
50 55 60

Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Gln Thr Lys Asp Ile
65 70 75 80

Cys Phe Ala Phe Val Asn Met Ile Ser Val Ser Asp Val Ser Tyr Tyr
85 90 95

His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Lys
100 105 110

Val Ile Ile Ile Tyr Gly Glu Thr Asn Ser Ile Ile Glu Leu Ser Phe
115 120 125

Arg Met Trp Ser Ser Pro Val Lys Gln Arg Ile Trp Val Thr Thr Lys
130 135 140

Gln Phe Asp Cys Pro Thr Ser Lys Arg Asp Leu Thr His Gly Thr Phe
145 150 155 160

Tyr Gly Thr Leu Thr Phe Leu His His Tyr Gly Glu Ile Ser Gly Phe
165 170 175

Lys Asn Phe Val Gln Thr Arg Tyr Asn Leu Arg Ser Thr Asp Leu Tyr
180 185 190

Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala Ser
195 200 205

Asn Cys Lys Ile Leu Arg Asn Tyr Leu Ser Asn Ile Ser Leu Glu Trp
210 215 220

Leu Met Glu Gln Lys Phe Asp Met Ser Phe Ser Asp Tyr Ser His Asn
225 230 235 240

Ile Tyr Asn Ala Val Tyr Ala Ile Ala His Ala Leu His Glu Lys Asp
245 250 255

Leu Gln Glu Phe Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu Asn
260 265 270

Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe Thr
275 280 285

Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val His
290 295 300

Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu Gly
305 310 315 320

Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly Gln
325 330 335

Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu Ala Thr Gly Ser Arg
340 345 350

Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe Arg
355 360 365

Arg Ile Trp Lys Glu Glu Met Ala Ala Cys Cys Phe Val Cys Asn Pro
370 375 380

Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Ala
385 390 395 400

Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys Ile
405 410 415

Gln Lys Gly Val Ile Val Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala
420 425 430

Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala Phe Thr Val Val Val Phe
435 440 445

Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn
450 455 460

Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe Leu
465 470 475 480

Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile Leu
485 490 495

Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr Val
500 505 510

Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro
515 520 525

Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile
530 535 540

Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp Leu
545 550 555 560

Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly
565 570 575

His Ile Ile Ile Val Cys Asn Lys Gly Ser Ile Thr Ala Phe Tyr Cys
580 585 590

Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Gly Ser Phe Thr Ile Ala
595 600 605

Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu
610 615 620

Thr Phe Ser Met Leu Val Phe Cys Ala Val Trp Val Thr Phe Leu Pro
625 630 635 640

Val Tyr His Ser Thr Lys Gly Lys Val Met Val Ala Val Glu Ile Phe
645 650 655

Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala Pro
660 665 670

Lys Val Tyr Ile Ile Leu Met Arg Pro Asp Arg Asn Ser Ile His Lys
675 680 685

Ile Arg Glu Lys Ser Tyr Phe
690 695

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 435 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CAGACTCTGA GCTACACCCT CCTTGTCTCC CTCACACTCT GCTTTCTCTC TTCCTCGCTC 60 

TTCATCGGCC GCCCCAGCCC TGCCACCTGC CTCCTCTCAC AGACCACCTT TGCAGCTGTG 120 

TTCACAGTGG CTGTGTTTTT CTGCAGGGCC TTCCAGGCTA TAAGGCCAGA AAGCAGGATC 180 

CGAAAGTGGA TGGGTCCCCA AAAAACAAAT TCTGTTGTCT TCCTTTGCTC CTTTACCCAA 240 

GTGACCCTCT GTGGAATCTG GCTGGGGACA GAGCCTCCCT TCGTAAACAA GGACCCTCAG 300 

TTCATGCCTG GCTACATCAT TATCCAGTGT AATGAGGGCT CCGTCACTGC CTTCTACTCT 360 

GTCTTGGGCT ACTTGGGCTT CTTGGTTTTA GGGTCCCTTG CTGTAGCCTT TCTGGCAAGG 420 

AACCTGCCTG ATGCT 435 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 145 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Gln Thr Leu Ser Tyr Thr Leu Leu Val Ser Leu Thr Leu Cys Phe Leu
1 5 10 15

Ser Ser Ser Leu Phe Ile Gly Arg Pro Ser Pro Ala Thr Cys Leu Leu
20 25 30

Ser Gln Thr Thr Phe Ala Ala Val Phe Thr Val Ala Val Phe Phe Cys
35 40 45

Arg Ala Phe Gln Ala Ile Arg Pro Glu Ser Arg Ile Arg Lys Trp Met
50 55 60

Gly Pro Gln Lys Thr Asn Ser Val Val Phe Leu Cys Ser Phe Thr Gln
65 70 75 80

Val Thr Leu Cys Gly Ile Trp Leu Gly Thr Glu Pro Pro Phe Val Asn
85 90 95

Lys Asp Pro Gln Phe Met Pro Gly Tyr Ile Ile Ile Gln Cys Asn Glu
100 105 110

Gly Ser Val Thr Ala Phe Tyr Ser Val Leu Gly Tyr Leu Gly Phe Leu
115 120 125

Val Leu Gly Ser Leu Ala Val Ala Phe Leu Ala Arg Asn Leu Pro Asp
130 135 140

Ala
145

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

CCCATTGTGA AGGCTAATAA CCAGACTCTG AGCTACACCC TCCTTGTCTC CCTCACACTC 60 

TGCTTTCTCT CTTCCTCGCT CTTCATCGGC CGCCCCAGCC CTGCCACCTG CCTCCTCTCA 120 

CAGACCACCT TTGCAGCTGT GTTCACAGTG GCTGTGTTTT CTGCAGGGCC TTCCAGGCTA 180 

TAAGGCCAGA AAGCAGGATC CGAAAGTGGA TGGGTCCCCA AAAAACAAAT TCTGTTGTCT 240 

TCCTTTGCTC CTTTACCCAA GTGACCCTCT GTGGAATCTG GCTGGGGACA GAGCCTCCCT 300 

TCGTAAACAA GGACCCTCAG TTCATGCCTG GCTACATCAT TATCCAGTGT AATGAGGGCT 360 

CCGTCACTGC CTTCTACTCT GTCTTGGGCT ACTTGGGCTT CTTGGTTTTA GGGTCCCTTG 420 

CTGTAGCCTT TCTGGCAAGG AACCCGCCAG ATACGTTCAA TGAGGCCAAG TTAA 474 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

ACTCCCATTG TGAAGGCCAA CAACTGCCAG CTCAGCTATC TCCTGCTGTC CTCCTTGGCC 60 

CTCAGCTTCC TCTGCCCCTT CATGTTCATT GGCCACCCAG ACCCCATCAC TTGTGCTGTG 120 

CACNAGGCAG ATTTTGGGGT CACCTTCATG GTCTGCACAT CCACTGTGCT GGCCAAGACC 180 

ATCGTGGTGG TGGCAGCCTT CCATGCCACC CAGGCAGACA CTCAGCTTAG GGGGTGGGCG 240 

GGGACAGTCC TCCTCAGCAC CATCCTCACT GTTCCCTGAC CCAGGCAGCC TTGTGTGCAC 300

TCTGGGTGAC CAGATGGCCC CCTCAGCCTG TAAAATCT 338 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

AACCTNCCCG ATACNTTCAA TGAAGCCAAG TTCTTGATGT TCAGCATGCT GATGTTATGT 60 

ACTGTTTGAA TTACCTTCCA TACTGTGTAA CATAGCACCA AAGGGAAGGT CATGGTTGCC 120 

TTGGAAATAT TCTCCACCTT GACTTCCAGT GCTGAGTGCT AGGNTGTATC TTCGCNCCAA 180

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
ATTGGATCCA GGCCGCTCTG GACAAAATAT GAATTCT 37

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
GGCACATGGA CGAAATCTTG GTACTCTTCA GAATTCT 37 
(2) INFORMATION FOR SEQ ID NO: 58: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asn Met Asp Gln Cys Ala Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr
1 5 10 15

Glu Lys Asn Lys Cys Ile Gln Lys Gly Val Ile Val Leu Ser Tyr Glu
20 25 30

Asp Pro Leu Gly Met Ala Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala
35 40 45

Phe Thr Val
50

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1079 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Met Ala Ser Tyr Ser Cys Cys Leu Ala Leu Leu Ala Leu Ala Trp His
1 5 10 15

Ser Ser Ala Tyr Gly Pro Asp Gln Arg Ala Gln Lys Lys Gly Asp Ile
25

His Phe Gly Val 

Ser Val Glu Cys 
60 

Ala Met He Phe 
75 

Pro Asn Met Thr 
90 

Ser Lys Ala Leu 
105 

Asp Ser Leu Asn 

Ser Thr He Ala 
140 

Val Ala Asn Leu 
155 

Ser Ser Ser Arg 
170 

Arg Thr He Pro 
185 

He Glu Tyr Phe 

Asp Tyr Gly Arg 
220 

Arg Asp He Cys 
235 

Glu Glu Glu He 
250 

Lys Val He Val 
265 

Lys Glu He Val 

Glu Ala Trp Ala 
300 

Val Val Gly Gly 
315 

Gly Phe Arg Glu 
330 

Asn Gly Phe Ala 
345 

Gin Glu Gly Ala 

His Glu Glu Gly 
380 

Pro Leu Cys Thr 
395 

Met Asp Tyr Glu 
410 

Tyr Ser lie Ala 
425 

Arg Gly Leu Phe 

Ala Trp Gin Val 
460 

Met Gly Glu Gin 
475 

Tyr Ser He He 
490 

Phe Lys Glu Val 
505 

Leu Phe He Asn 

Val Pro Phe Ser 
540 



30 

Ala Ala Lys Asp 
45 

He Arg Tyr Asn 

Ala He Glu Glu 
80 

Leu Gly Tyr Arg 
95 

Glu Ala Thr Leu 
110 

Leu Asp Glu Phe 
125 

Val Val Gly Ala 

Leu Gly Leu Phe 
160 

Leu Leu Ser Asn 
175 

Asn Asp Glu His 
190 

Arg Trp Asn Trp 
205 

Pro Gly He Glu 

He Asp Phe Ser 
240 

Gin Gin Val Val 
255 

Val Phe Ser Ser 
270 

Arg Arg Asn He 
285 

Ser Ser Ser Leu 

Thr He Gly Phe 
320 

Phe Leu Gin Lys 
335 

Lys Glu Phe Trp 
350 

Lys Gly Pro Leu 
365 

Gly Asn Arg Leu 

Gly Asp Glu Asn 
400 

His Leu Arg He 
415 

His Ala Leu Gin 
430 

Thr Asn Gly Ser 
445 

Leu Lys His Leu 

Val Thr Phe Asp 
480 

Asn Trp His Leu 
495 

Gly Tyr Tyr Asn 
510 

Glu Glu Lys lie 
525 

Asn Cys Ser Arg

Asp Cys Gln Ala Gly Thr Arg Lys Gly Ile Ile Glu Gly Glu Pro Thr
545 550 555 560

Cys Cys Phe Glu Cys Val Glu Cys Pro Asp Gly Glu Tyr Ser Gly Glu
565 570 575

Thr Asp Ala Ser Ala Cys Asp Lys Cys Pro Asp Asp Phe Trp Ser Asn
580 585 590

Glu Asn His Thr Ser Cys Ile Ala Lys Glu Ile Glu Phe Leu Ala Trp
595 600 605

Thr Glu Pro Phe Gly Ile Ala Leu Thr Leu Phe Ala Val Leu Gly Ile
610 615 620

Phe Leu Thr Ala Phe Val Leu Gly Val Phe Ile Lys Phe Arg Asn Thr
625 630 635 640

Pro Ile Val Lys Ala Thr Asn Arg Glu Leu Ser Tyr Leu Leu Leu Phe
645 650 655

Ser Leu Leu Cys Cys Phe Ser Ser Ser Leu Phe Phe Ile Gly Glu Pro
660 665 670

Gln Asp Trp Thr Cys Arg Leu Arg Gln Pro Ala Phe Gly Ile Ser Phe
675 680 685

Val Leu Cys Ile Ser Cys Ile Leu Val Lys Thr Asn Arg Val Leu Leu
690 695 700

Val Phe Glu Ala Lys Ile Pro Thr Ser Phe His Arg Lys Trp Trp Gly
705 710 715 720

Leu Asn Leu Gln Phe Leu Leu Val Phe Leu Cys Thr Phe Met Gln Ile
725 730 735

Leu Ile Cys Ile Ile Trp Leu Tyr Thr Ala Pro Pro Ser Ser Tyr Arg
740 745 750

Asn His Glu Leu Glu Asp Glu Ile Ile Phe Ile Thr Cys His Glu Gly
755 760 765

Ser Leu Met Ala Leu Gly Ser Leu Ile Gly Tyr Thr Cys Leu Leu Ala
770 775 780

Ala Ile Cys Phe Phe Phe Ala Phe Lys Ser Arg Lys Leu Pro Glu Asn
785 790 795 800

Phe Asn Glu Ala Lys Phe Ile Thr Phe Ser Met Leu Ile Phe Phe Ile
805 810 815

Val Trp Ile Ser Phe Ile Pro Ala Tyr Ala Ser Thr Tyr Gly Lys Phe
820 825 830

Val Ser Ala Val Glu Val Ile Ala Ile Leu Ala Ala Ser Phe Gly Leu
835 840 845

Leu Ala Cys Ile Phe Phe Asn Lys Val Tyr Ile Ile Leu Phe Lys Pro
850 855 860

Ser Arg Asn Thr Ile Glu Glu Val Arg Ser Ser Thr Ala Ala His Ala
865 870 875 880

Phe Lys Val Ala Ala Arg Ala Thr Leu Arg Arg Pro Asn Ile Ser Arg
885 890 895

Lys Arg Ser Ser Ser Leu Gly Gly Ser Thr Gly Ser Ile Pro Ser Ser
900 905 910

Ser Ile Ser Ser Lys Ser Asn Ser Glu Asp Arg Phe Pro Gln Pro Glu
915 920 925

Arg Gln Lys Gln Gln Gln Pro Leu Ser Leu Thr Gln Gln Glu Gln Gln
930 935 940

Gln Gln Pro Leu Thr Leu His Pro Gln Gln Gln Gln Gln Pro Gln Gln
945 950 955 960

Pro Arg Cys Lys Gln Lys Val Ile Phe Gly Ser Gly Thr Val Thr Phe
965 970 975

Ser Leu Ser Phe Asp Glu Pro Gln Lys Asn Ala Met Ala His Arg Asn
980 985 990

Ser Met Arg Gln Asn Ser Leu Glu Ala Gln Arg Ser Asn Asp Thr Leu
995 1000 1005

Gly Arg His Gln Ala Leu Leu Pro Leu Gln Cys Ala Asp Ala Asp Ser
1010 1015 1020

Glu Met Thr Ile Gln Glu Thr Gly Leu Gln Gly Pro Met Val Gly Asp
1025 1030 1035 1040

His Gln Pro Glu Met Glu Ser Ser Asp Glu Met Ser Pro Ala Leu Val
1045 1050 1055

Met Ser Thr Ser Arg Ser Phe Val Ile Ser Gly Gly Gly Ser Ser Val
1060 1065 1070

Thr Glu Asn Val Leu His Ser
1075

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified Base
(B) LOCATION: 3...3
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 12...12
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 15...15
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 18...18
(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
BTNYAYCARR TNGCNMCNAA RGAYAC 26 
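
SEQ ID NO: 60 is written with IUPAC ambiguity codes, and the feature table marks four N positions as inosine. The following is a minimal illustrative sketch, not part of the filed listing, of how the fold degeneracy of such a primer could be counted. The helper name `degeneracy` is hypothetical, and treating each inosine as a single base that adds no degeneracy is an assumption.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer, inosine_positions=()):
    """Count the distinct sequences encoded by a degenerate primer.

    Positions are 1-based. Positions listed in inosine_positions are
    skipped, on the assumption that inosine is one physical base.
    """
    total = 1
    for pos, code in enumerate(primer.replace(" ", "").upper(), start=1):
        if pos in inosine_positions:
            continue
        total *= len(IUPAC[code])
    return total


seq_60 = "BTNYAYCARR TNGCNMCNAA RGAYAC"
print(len(seq_60.replace(" ", "")))          # 26
print(degeneracy(seq_60, {3, 12, 15, 18}))   # 384
```

Without the inosine substitutions, each of the four N positions would multiply the count by four.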
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: Modified Base
(B) LOCATION: 6...6
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 9...9
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 12...12
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 18...18
(D) OTHER INFORMATION: Inosine

(A) NAME/KEY: Modified Base
(B) LOCATION: 21...21
(D) OTHER INFORMATION: Inosine

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GYRTKNGCNR YNRCRTRNAC NRCRTT 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified Base

(B) LOCATION: 3...3

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 9...9

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 12...12

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 13...13

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 24...24

(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
MRNTGYCCNK ANNAYMARTA YGCNAA 26

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: Modified Base

(B) LOCATION: 2...2

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 5...5

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 8...8

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 11...11

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 14...14

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 20...20

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 26...26

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 29...29

(D) OTHER INFORMATION: Inosine

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
GNCKNAYNAR NATNAYRTAN MWYTTNGGNA C 31 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified Base

(B) LOCATION: 3...3

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 6...6

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 9...9

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 12...12

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 16...16

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 24...24

(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
ATNWSNYTNR TNTTYNGYTT YYTNTG 26

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified Base

(B) LOCATION: 2...2

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 5...5

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 11...11

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 17...17

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 20...20

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 23...23

(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

RNATNSWRAA NAYYTCNACN RCNACCAT 28

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified Base

(B) LOCATION: 6...6

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 9...9
(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 12...12
(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 15...15
(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 21...21
(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
GAYACNCCNA TNGTNAARGC NAAYAA 26 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified Base 

(B) LOCATION: 3...3

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 6...6
(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 12...12
(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 15...15

(D) OTHER INFORMATION: Inosine



(A) NAME/KEY: Modified Base

(B) LOCATION: 24...24

(D) OTHER INFORMATION: Inosine



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

AANGTNAYCC ANACNSWRCA RAANAC 26 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2550 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

ATGAAGCAGC TCTGCGCTTT CACTATTTCT TTGTTGTTTC TGAAGTTTTC TCTCATCCTG 60 

TGCTGTTTGA CTGAACCAAG TTGCTTTTGG AGAATAAGGA ATAGTGAAGA TAGTGATGGA 120 

GATTTACAAA GGGAATGTCA TTTTTACCTT TGGAAAACTG ATGAACCTAT TGAAGATAGT 180

TTTTATAATT ATGATTTAAG TTTTAGAATT GCAGCAAGTG AATATGAGTT TCTTCTCGTA 240 

ATGTTTTTTG CTATCGATGA GATCAACAGG AATCCTTATC TTTTACCCAA CATAACTTTG 300 

ATGTTCTCCT TCATTGGTGG AAACTGTCAG GATTTATTGA GAGTTATGGA CCAAGCATAT 360 

ACACAAATAA ATGGACATAT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

GCCATAGGTC TTACAGGACC ATCATGGAAA ACTTCCTTAA AACTGGCAAT GCACTCTTCG 480 

ATGCCACTGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTGTCCC ATGGCATGGT CTCCTTGATG 600 

TTTCACTTTA GATGGACTTG GATAGGACTG GTCATCTCAG ATGATGACCA GGGTATTCAG 660

TTTCTCTCAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTAGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TAAACACATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGAAA TGAACTCTAC TCTAGAAGCA 840 

AGCTTTAGAA GATGGGAAGA GTTAGGTGCT CGGAGAATCT GGATCACAAC CTCACAATGG 900 

GATGTCATCA CAAATAAAAA AGACTTCACC CTTAATCTCT TCCATGGGAT CATCACTTTT 960 

GAACATCATA GATTTGAGAT TCCTAAATTA AATAAATTCA TGCAAACAAT GAACACTGCC 1020

AAATACCCAG TAGATATTTC TCATACTATA TTGGAGTGGA ATTATTTTAA TTGTTCAATA 1080 

TCTAAGAACA GCATTAGAAT GCATCATATT ACATTCAACA ACACCTTGGA ATGGACATCA 1140 

CTGCACAACT ATGATGTGGC GATGAGTGAT GAAGGTTACA ATTTGTACAA TGCTGTTTAT 1200 

GCTGTGGCCC ACACCTACCA TGAATACATT TTTCAACAAG TAGAGTCTCA GAAAAAGGCA 1260 

AAACCCAAAA GATATTTCAC TGCTTGTCAG CAGGTGTCTT CCTTGATGAA AACCAGGGTA 1320

TTTACGAACC CTGTTGGAGA ACTGGTGAAC ATGAAGCATA GGGAAAATCA GTGTACAGAG 1380

TATGATATTT TCATCATTTG GAATTTTCCA CAAGGCCTTG GATTAAAAGT GAAAATAGGA 1440 

AGCTATTTAC CTTGTTTTCC ACAGAGACAA AAACTTCATA TATCTGATGA TTTGGAATGG 1500 

GCCAAGGGAG GAACATCACC TCAGGTTCCC TCCTCCGTGT GTAGTGTGGC ATGTACTGCT 1560 

GGATTCAGGA AAATTTATCA AAAAGAAACA GCAGACTGCT GCTTTGATTG TGTTCAGTGC 1620 

CCAGAAAATG AGATTTCCAA CGAAACAGAT ATGGAACAGT GTGTGAGGTG TCCAGATGAT 1680 

AAGTATGCCA ACATAGAGCA AACCCACTGC CTCTCAAGAG CTGTATCATT TCTGGCTTAT 1740

GAAGATTCAT TGGGGATGGC TCTAGGCTGC ATGGCACTGT CCTTCTCAGC CATCACAATT 1800

CTAATCCTCG TCACATTTGT GAAGTACAAA GATACTCCCA CTGTGAAGGC CAATAACCGC 1860 

ATTCTCAGCT ACATCCTGCT CATCTCTCTC GTCTTCTGCT TTCTCTGCTC CCTGCTCTTC 1920 

ATTGGACCTC CCGACCAGGT CACCTGCATC TTTCAGCAGA CCACATTTGG AGTATTGTTC 1980 

ACTGTGTCTG TTTCTACAGT GTTGGCCAAA ACAATAACTG TGGTCATGGC TTTCAAGCTC 2040 

ACTACTCCAG GAAGAAGGAT GAGAGGGATG ATGATGACAG GGGCACCTAA GTTGGTCATT 2100 

CCCATTTGTA CCCTGATCCA ACTTGTTCTC TGTGGAATCT GGTTGGTCAC ATCTCCTCCC 2160

TTTATTGACA GAGACATACA ATCTGAGCAT GGGAAGATTG TCATTCTTTG CAATAAAGGC 2220 

TCAGTCATTG CCTTCCACGT CGTCCTGGGA TACTTGGGCT CCTTGGCTCT GGGGAGCTTC 2280 

ACGTTGGCTT TCCTGGCTAG GAACCTTCCT GACACATTCA ATGAAGCCAA GTTCCTAACT 2340

TTCAGCATGC TGGTGTTCTG CAGTGTCTGG ATCACCTTCC TCCCTGTCTA CCACAGCACC 2400 

AGGGGGAGGG TCATGGTGGT TGTGGAGGTT TTCTCCATCT TGGCTTCTAG TGCAGGGTTG 2460 

CTAATGTGTA TCTTTGTCCC AAAGTGTTAT GTTATTTTAA TTAGACCAGA TTCAAATTTT 2520 

ATAAAGAACC ACAAAGGTAA ATTGCTTTAT 2550

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2424 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: 

ATGAAGCAGC TCTGCACTTT CACTATTTCA TTGTTGTTTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTGGA GTGAACCAAG CTGCTTTTGG AGGATAAAGA AGAGTGAAGA TAATGATGGA 120 

GATTTACAAA GGGAGTGTCA TTTTTACCTT TGGAAAACTG ATGAACCTAT TGAAGATAGT 180 

TTTTATAATT ATGATTTAAG TTTTAGAATT GCAGGAAGTG AATATGAGCT TCTTCTGGTA 240 

ATGTTTTTTG CTACTGATGA GATCAACAAG AATCCTTATC TTTTACCCAA CATGAGTTTG 300 

ATGTTCTCCA TCATTGGTGG AAACTGTCAT GATTTATTGA GAAGTCTGGA TCAAGAATAT 360 

GCACAAATAG ATGGACATAT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

GCCACAGGCC TTACAGGACC ATCATGGAAA ACATCCTTAA AACTGGCAAT GCATTCTTCA 480 

ATGCCACTGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTGTCCC ATGGCATGGT CTCCTTGATG 600

TTTCATTTTA GGTGGACTTG GATAGGACTG GTCATCTCAG ATGATGATCA GGGTATTCAG 660 

TTTCTCTCAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTGGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TACACAAATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGACA TGAACTCTAC TCTAGAAGCA 840 

AGCTTTAGAA GATGGGAAGA GTTAGGTGCT CGGAGAATCT GGATCACAAC CACACAATGG 900 

GATGTCATCA CAAATAAAAA AGACTTCACC CTTAATCTCT TCCATGGGAC TATTACTTTT 960 

GCACACCACA AAGATGAGAT TCCTAAATTT AGGAATTTTA TGCAAACAAA GAAAACTGCC 1020 

AAATACCTTG TAGATATTTC TCATACTATT TTGGAGTGGA ATTATTTTAA TTGTTCAATC 1080 

TCTAAGAACA GCAGTAAAAT GGGTCATTTT ACATTCAACA ACACATTGCA ATGGACAGCA 1140 

CTGCACAACT ATGATATGGC CCTGAGCGAT GAAGGTTACA ATTTGTATAA TGCTGTTTAT 1200 

GCTGTGGCCC ACACCTACCA TGAATACATT CTTCAACAAG TAGAGTCTCA GAAAAAGGCA 1260 

AAACCCAAAA GATATTTCAC TGCTTGTCAG CAGGTGTCTT CCTTGATGAA AACCAGGGTA 1320 

TTTATGAACC CTGTTGGAGA ACTGGTGAAC ATGAAGCATA GGGAAAATCA GTGTACAGAG 1380

TATGATATTT TCATCATTTG GAATTTTCCA CAAGGCCTTG GATTAAAAGT GAAAGTAGGA 1440

AGCTATTTAC CTTGCTTTCC AAAGAGTCAA CAACTTCATA TAGCTGATGA TTTGGAATGG 1500 

GCCATGGGAG GAACATCAGT GGATATGGAA CAGTGTGTGA GATGTCCAGA TAATAAATAT 1560

GCCAATTTAG AGCAAACCCA CTGCCTCCAA AGAACGGTGT GATTTCTGGC TTATGAAGAT 1620 

CCATTGGGGA TGGCTCTAGG CTGCATGGCA CTGTCCTTCT CGGCCATCAC AATTCTAGTC 1680 

CTCGTCACAT TTGTGAAGTA CAAGGATACT CCCATTGTGA AGGCCAATAA CCGCATTCTC 1740 

AGCTACATCC TGCTCATCTC TCTCGTCTTC TGCTTTCTCT GTTCCCTGCT CTTCATTGGA 1800 

CATCCCGACC AGGTCACCTG CATCTTGCAG CAGACCACAT TTGGAGTATT GTTCACTGTG 1860 

TCTGTTTCTA CAGTGTTGGC CAAAACAATA ACTGTGGTCA TGGCTTTCAA GCTCACTACT 1920 

CCAGGAAGAA GGATGAGAGG GATGATGATG ACAGGGGCAC CTAAGTTGGT CATTCCCATT 1980 

TGTACCCTGA TCCAACTTGT TCTCTGTGGA ATCTGGTTGG TCACATCTCC TCCCTTTATT 2040 

GACAGAGATA TACAATCTGA ACATGGGAAG ATTGTCATTC TTTGCAATAA AGGCTCTGTC 2100 

GTTGCCTTCC ACGTCGTCCT GGGATACTTG GGCTCCTTGG CTCTGGGGAG CTTCACTTTG 2160 

GCTTTCTTGG CTAGGAACCT TCCTGACACA TTCAATGAAG CCAAGTTCCT AACTTTCAGC 2220 

ATGCTGGTGT TCTGCAGTGT CTGGATCACC TTCCTCCCTG TCTACCACAG CACCAGGGGG 2280 

AAGGTCATGG TGGTTGTGGA GGTTTTCTCC ATCTTGGCTT CTAGTGCAGG GTTGCTAATG 2340

TGTATCTTTG TCCCAAAGTG TTATGTTATT TTAATTAGAC CAGATTCAAA TTTTATACAG 2400 

AACCACAAAG GTAAATTGCT TTAT 2424 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2409 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
CATTTTTACC TTGGGGCAGT TGATAAACCA ATTGAAGATA ATTTTTATAA TTCACTTTTA 60

AAGTTTAGAA TTGCAGCAAG TGAATATGAG TTTCTTCTGG TAATGTTTTT TGCTACTGAT 120

GAGATCAACA AGAATCCTTA TCTTTTACCC AACATAACTT TGATGTTCTC CATCATTGGT 180 

GGAAACTGTC ATGATTTATT GAGAGGTTTG GATCAAGCAT ATACACAAAT AAATGGACAT 240 

ATGAATTTTG TTAATTATTT CTGTTATTTA GATGATTCAT GTGCCATAGG TCTTACAGGA 300 

CCATCATGGA AAACATCCTT AAATCTGGCA ATGCATTCTT CAATGCCACT GGTTTTCTTT 360 

GGATCATTTA ATCCTAACCT ACATGACCAT GACCGGCTGC ACCATGTCCA TCAAGTAGCC 420 

ACCAAGGACA CACATTTGTC CCATGGCATT GTCTCCTTGA TGTTTCATTT TAGATGGACT 480 

TGGATAGGAC TGGTCATCTC AGATGATGAC AAGGGTATTC AGTTTCTCTC AGATTTAAGA 540 

GAAGAAAGCC AAAGGCATGG GATCTGTTTA GCTTTTGTTA ATATGATCCC AGAAAACATG 600

CAGATATACA TGACAAGGGC TACAATATAT GATAAACAAA TTATGACGTC TTTAGCAAAA 660 

GTTGTTATCA TTTATGGTGA AATGAACTCT ACACTAGAAG TAAGCTTTAG AAGATGGGAA 720

AATTTAGGTG CTCGGAGAAT CTGGATCACA ACCTCACAAT GGGATGTCAT CACAAATAAA 780 

AAAGAATTCA CCCTTAATCT CTTCCATGGG ACTATTACTT TTGCACACCG CAGATTTGAG 840 

ATTCCTAAAT TTAAAAAATT TATGCAAACA ATGAACACTG CCAAATACCC AGTAGATATT 900

TCTCATACTA TATTGGAGTG GAATTATTTT AATTGTTCAA TCTCTAAGAA CAGCAGTAAA 960 

ATGGATCATA TTACATTCAA CAACACATTG GAATGGACAG CACTGCACAA CTATGATATG 1020 

GTGATGAGTG ATGAAGGTTA CAATTTGTAT AATGCTGTTT ATGCTGTGGC CCACACCTAC 1080 

CATGAACATA TTTTTCAACA AGTAGAGTCT CAGAAAAAGG CAAAACCCAA AAGATTTTTC 1140

ACTGTTTGTC AGCAGGTGTC TTCCTTGATG AAAACCAGGG TATTTACTAA CCCTGTTGGA 1200 

GAACTGGTGA ACATGAAGCA TAGGGAAAAT CAGTGTACAG AGTATGACAT TTTCCTCATT 1260 

TGGAACTTTC CACAAGGCCT TGGATTAAAA GTGAAAATAG GAAGCTATTT ACCTTGTTTT 1320 

CCACAGAGAC AAGAACTTCA TATATCTGAT GATTTGGAAT GGGCCATGGG AGGAACATCA 1380 

GTGGTTCCCT CCTCTGTGTG TAGTGTGGCA TGTACTGCAG GATTCAGGAA AATTCATCAG 1440 

AAAGAAACAG CAGACTGCTG CTTTGATTGT GTTCAGTGCC CAGAAAATGA GGTTTCCAAT 1500 

GAAACAGATA TGGAACAGTG TGTGAAGTGT CCATATGATA AGTATGCCAA CATAGAGAAA 1560 

ACCCACTGCC TCTCAAGAGC TGTATCATTT CTGGCTTATG AAGATCCATT GGGGATAGCT 1620 

CTAGGCTGCA TAGCACTGTC CTTCTCAGCC ATCACAATTC TAGTACTAAT CACATTTTTG 1680 

AAGTACAAGG ATACTCCCAT TGTGAAGGCC AATAACCGCA TTCTCAGCTA CATCCTGCTC 1740 

ATCTCTCTAG TCTTCTGCTT TCTCTGCTCC CTGCTCTTCA TTGGACATCC AAACCAGGTC 1800 

TCCTGCGTCT TGCAGCAGAC CACATTTGGA GTATTTTTCA CTGTGTCTGT TTCTACAGTG 1860 

TTGGCCAAAA CAATAACTGT GGTCATGGCT TTCAAGCTCA CTACTCCAGG AAGAAGAATG 1920 

AGAGAGATGT TGGTAACAGG GGCACCTAAG TTGGTCATTC CCATTTGTAC CCTAATCCAA 1980 

TTTGTTCTCT GTGGAATCTG GTTGATAACA TCTCCTCCAT TTATTGACAG AGATATACAA 2040

TCTGAGCATG GGAAGATTGT CATTCTTTGC AATAAAGGCT CTGTCATTGC CTTCCATGTT 2100 

GTCCTGGGAT ACTTGGGCTC CTTGGCTCTG GGGAGCTTCA CTTTGGCTTT CTTGGCTAGG 2160 

AACCTTCCTG ACACATTCAA TGAAGCCAAA TTCCTGACTT TGAGCATGCT GGTGTTCTGC 2220 

AGTGTCTGGA TCACCTTTCT CCCTGTCTAC CATAGCACCA GGGGGAAGGT CATGGTGGTT 2280 

GTGGAGGTTT TCTCAATCTT GGCTTCTAGT GCAGGGTTGC TAATGTGTAT CTTTGTCCCA 2340 

AAGTGTTATG TTATTTTAGT TAGACCAGAT TCAAATTTTA TACGGAAGTA CAAAGATAAA 2400

TTTCGTTAT 2409 

(2) INFORMATION FOR SEQ ID NO: 71: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2556 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 



ATGTTCATTT TCATGGGAGT CTTCTTCCTA CTTAATATTA CACTTCTCAT GGCCAATTTC 60 

ATTGATCCCA GGTGCTTTTG GAGAATAAAT TTGGATGAAA TAACGGATGA ATATTTGGGA 120 

TTATCTTGTG CTTTCATCCT GGCAGCTGTT CAGACACCCA TTGAAAAAGA TTATTTCAAC 180 

ACGACTCTTA ATTTTCTAAA AACTACTAAA AACCACAAAT ATGCTTTGGC ATTGGTGTTT 240 

GCAATGGATG AAATCAACAG ATATCCTGAT CTTTTACCAA ATATGTCTTT GATTATCAGA 300 

TACTCTTTGG GCCATTGTGA TGGAAAAACT GTAACACCTA CACCATATTT ATTTCATAGA 360 

AAAAAGCAAA GCCCTATTCC TAATTATTTC TGTAATGAAG AGAGTATGTG TTCATTTCTG 420

CTTTCAGGAC CCAATTGGGA TGAATCTTTA AGTTTCTGGA AGTACCTGGA CAGCTTCTTA 480 

TCTCCACGTA TCCTTCAGCT TTCCTATGGA TCTTTCAGTT CCATCTTCAG TGATGATGAA 540 

CAATATCCCT ATCTCTATCA GATGGCCCCA AAAGACACAT CTCTAGCATT GGCAATGGTC 600 

TCCTTCATAC TTTATTTGAA ATGGAATTGG ATTGGCCTTG TCATCCCAGA TGATGATCAA 660 
GGAAACCAAT TTCTTTTAGA GTTGAAGAAA CAGAGTGAAA ACAAAGAAAT TTGCTTTGCC 720

TTTGTGAAAA TGATCTCTGT TGATGAAGTT TCATTTCCAC AAAAAACTGA AATAAACTAC 780 

AAACAAATTG TGAAGTCACT AACAAATGTT ATTATCATTT ATGGAGAAAC ATATAATTTC 840 

ATTGATTTGA TCTTCAGAAT GTGGGAACCT CCCATTTTAC AGAGAATATG GATCACCACA 900 

AAACAATTGA ATTTCCCTAC CAGTAAGACA GACATAAGTC ATGACACATT CTATGGATCA 960

CTTACTTTTC TACCCCACCA TGGTGAGATT TCTGGCTTTA AAAATTTTGT ACAGACATGG 1020 

TTCCATCTCA GAAACACAGA TTTATGTCTA GTAATGCCAG AGTGGAAATA TATTAACTCT 1080 

GAAGACTCAG CATCTAATTG TAAAATACTT AAGAACAGTT CATCTGATGC CTCATTTGAT 1140 

TGGCTAATGG AAGAGAAGCT TGACATGGCC TTTAGTGAGA ATAGTCATAA CATATATAAT 1200 

GCTGTGCATG CCATAGCCCA TGCCCTCCAT GAGATGAATC TGCAACAGGC TGATAATCAG 1260 

GCAATAGATA ATGGAAAAGG AGCCAGTTCT CACTGCTTGA AGGTAAACTC CTTTCTAAGA 1320 

AGGACCTACT TCACTAATCC TCTTGGGGAC AAAGTGTTTA TGAAGCAAAG AGTAATAATG 1380 

CAGGATGAAT ATGACATTGT TCACTTTGCG AATCTCTCAC AACACCTTGG GATTAAGATG 1440 

AAGTTAGGAA AGTTCAGCCC ATATTTACCA CATGGTCGAC ACTCTCACTT ATACGTAGAC 1500 

ATGATTGAGT TGGCCACAGG AAGAAGAAAG ATGCCATCCT CTGTGTGCAG TGCAGATTGT 1560 

AGTCCTGGAT TCAGAAGATT ATGGAAGGAG GGAATGGCAG CCTGCTGTTT TGTTTGCAGC 1620 

CCCTGCCCTG AAAATGAAAT TTCTAATGAG ACAAATATGG ATCAATGCGT GAATTGTCCA 1680 

GAATACCAAT ATGCCAACAC AGAACAGAAC AAATGTATTC AGAAAGGTGT CACCTTCCTA 1740 

AGCTATGAAG ACCCCTTGGG GATGGCACTT GCCTTAATGG CCTTCTGCTT CTCTGCATTC 1800 

ACAGCTGTGG TACTTTGTGT CTTTGTGAAG CACCATGACA CTCCTATTGT GAAGGCCAAT 1860

AACAGAAGCC TCAGCTATCT ATTACTCATG TCACTCATGT TCTGTTTTCT GTGCTCCTTT 1920 

TTCTTCATTG GCCTTCCAAA CAAAGTCATC TGTGTCTTAC AGCAAATCAC ATTTGGAATT 1980 

GTATTCACTG TGGCTGTTTC CACAGTTCTG GCCAAAACAG TCACTGTGGT TCTAGCTTTC 2040 

AAAGTCACAG TCCCAGGAAG AAGATTGAGA TACTTCCTTG TATCAGGGAC ACTAAACTAC 2100 

ATTATTCCTA TATGTTCCCT ACTCCAATGT GTTCTGTGTG CAATCTGGCT AGCAGTCTCT 2160

CCTCCCTTTG TTGATATTGA TGAACACTCT CAGCATGGCC ACATCATCAT TGTGTGCAAC 2220 

AAGGGCTCAG TTACTGCATT CTACTGTGTC CTTGGATACT TGGCCTGCCT GGCACTGGGA 2280 

AGCTTCACTT TGGCTTTCTT GGCCAAGAAT CTGCCTGATG CATTCAATGA AGCCAAGTTC 2340 

TTGACCTTCA GCATGCTAGT GTTCTGCAGT GTCTGGGTCA CCTTCCTCCC TGTGTACCAT 2400 

AGCACAAAGG GCAAACACAT GGTTGCTGTG GAGATCTTCT CTATCTTGGC ATCCAGTGCA 2460

GGGATGCTTG GATGTATTTT TGTACCCAAG ATTTATATCA TTTTAATGAG ACCAGAGAGA 2520 

AATTCTACCC AAAAGATCAG AGAAAAATCA TATTTT 2556 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2169 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

ATCTGTAATG AAGAGAGTAT GTGTTCATTT CTGCTTTCAG GACCCAATTG GGATGAATCT 60 

TTAAGTTTCT GGAAGTACCT GGACAGCTTC TTATCTCCAC ATATCCTTCA GCTTTCCTAT 120 

GGATCTTTCA GTTCCATCTT CAGTGATGAT GAACAATATC CCTATCTCTA TCAGATGGCC 180 

CCAAAGGACA CATCTCTAGC ATTGGCAATG GTCTCCTTCA TACTTTATTT GAAATGGAAT 240 

TGGATTGGCC TTGTCATCCC AGATGACGAT CAAGGAAACC AATTTCTTTT AGAGTTGAAG 300 

AAACAGAGTG AAAACAAAGA AATTTGCTTT GCCTTTGTGA AAATGATATC TGTTGATGAA 360 

GTTTCATTTC CACAAAAAAC TGAAATATAC TACAAACAAA TTGTGAAGTC ATTAACAAAT 420 

GTTATTATCA TTTATGGAGA AACATATAAT TTCATTGATT TGATCTTCAG AATGTGGGAA 480 

CCTCCCATTT TACAGAGAAT ATGGATCACC ACAAAACAAT TGAATTTCCC TACCAGTAAG 540 

ACAGACATAA GTCATGACAC ATTCTATGGA TCACTTACTT TTCTACCCCA CCATGGTGAG 600 

ATTTCTGGCT TTAAAAATTT TGTACAGACA TGGTTCCATC TCAGAAACAC AGATTTATAT 660 

CTAGTAATGC CAGAGTGGAA ATATATTAAC TCTGAAGACT CAGCATCTAA TTGTAAAATA 720 

CTGAAGAACA GTTCATCTGA TGCCTCATTT GATTGGCTAA TGGAACAGAA GCTTGACATG 780 

GCCTTTAGTG ATAATAGTCA TAACATATAT AATGTTGTGC ATGCCATAGC CCATGCCCTC 840 

CATGAGATGA ATCTGCAACA GGCTGATAAT CAGGCAATAG ATAATGGAAA AGGAGCCAGT 900 

TCTCACTGCT TGAAGGTAAA CTCCTTTCTA AGAAGGACCT ACTTCACTAA TCCTCTTGGG 960 

GACAAAGTGT TTATGAAGCA AAGAGTAATA ATGCAGGATG AATATGACAT TGTTCACTTT 1020 

GCGAATCTCT CACAACACCT TGGGATTAAG ATGAAGTTAG GAAAGTTCAG CCCATATTTA 1080 

CCACATGGTC GACACTCTCA CTTATACGTA GACATGATTG AGTTGGCCAC AGGAAGAAGA 1140 

AAGATGCCAT CCTCTGTGTG CAGTGCAGAT TGTAGTCCTG GATTCAGAAG ATTATGGAAG 1200 




GAGGGAATGG CAGCCTGCTG TTTTGTTTGC AGCCCCTGCC CTGAAAATGA AATTTCTAAT 1260 

GAGACAAATA TGGATCAATG CGTGAATTGT CCAGAATACC AATATGCCAA CACAGAACAG 1320 

AACAAATGTA TTCAGAAAGG TGTCACCTTC CTAAGCTATG AAGACCCCTT GGGGATGGCA 1380 

CTTGCCTTAA TGGCCTTCTG CTTCTCTGCA TTCACAGCTG TGGTACTTTG TGTCTTTGTG 1440 

AAGCACCATG ACACTCCTAT TGTGAAGGCC AATAACAGAA GCCTCAGCTA TCTATTACTC 1500 

ATGTCACTCA TGTTCTGTTT TCTGTGCTCC TTTTTCTTCA TTGGCCTTCC AAACAAAGTC 1560 

ATCTGTGTCT TACAGCAGAT CACATTTGGA ATTGTATTTA CTGTAGCTGT TTCCACAGTT 1620 

CTGGCCAAAA CAGTCACTGT GGTTCTAGCT TTCAAAGTCA CAGACCCAGG AAGAAGATTG 1680 

AGATACTTCC TTGTATCAGG GACACTAAAC TACATTATTC CTATATGTTC CCTACTCCAA 1740

TGTGTTCTGT GTGCAATCTG GCTAGCAGTC TCTCCTCCCT TTGTTGATAT TGATGAACAC 1800 

TCTCAGCATG GCCACATCAT CATTGTGTGC AACAAGGGCT CAGTTACTGC ATTCTACTGT 1860 

GTCCTTGGAT ACTTGGCCTG CCTGGCACTG GGAAGCTTCA CTTTGGCTTT CTTGGCCAAG 1920 

AATCTGCCTG ATGCATTCAA TGAAGCCAAG TTCTTGACCT TCAGCATGCT AGTGTTCTGC 1980 

AGTGTCTGGG TCACCTTCCT CCCTGTGTAC CATAGCACAA AGGGCAAACA CATGGTTGCT 2040 

GTGGAGATCT TCTCCATCTT GGCATCCAGT GCAGGGATGC TTGAATGTAT TTTTGTACCC 2100 

AAGATTTATA TCATTTTAAT GAGACCAGAG AGAAATTCTA CCCAAAAGAT CAGGGAAAAA 2160 

TCATATTTC 2169 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1889 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

GAATTCGGCT TCTGCACCAA ATGGCGACGA AAGACACATC TCTTTCACTT GCCATTGTTT 60 

CTTTGATGGT TCATTTTAGG TGGTCTTGGG TTGGTCTAAT TCTCCCAGAT GACCACAAAG 120 

GAAATAAAAT ACTATCAGAT TTTAGAAAGG AGATGGAAAG AAAAAGAATC TGTACGGCTT 180 

TTGTAAAAAT GATTCCTGCC ACATGGACTT CATCTTTTGT CAAATTCTGG GAAAATATGG 240 

ATGACACCAA CATAATAATT ATTTATGGTG ACATTGATTC TCTAGAAGGT CTAATGCGAA 300 

ATATTGGGCA AAGGTTATTG ACATGGCATG TCTGGGTCAT GAACATTGAA CCCCATATTA 360 

TTGAATATGA TAATTATTTC ATGTTAGATT CATTCCATGG AAGTTTAATT TTTAAGCACA 420 

ATTATAGAGA GAATTTTGAG TTTACCAAAT TTATTCGAAC AGTTAATCCT AAAAAATACC 480 

CAGAAGACAT TTATCTCCCT AAGATGTGGT ATTTGTTCTT CATGTGCTCA TTTTCTGATA 540 

TTAATTGTCA AGTTTTGGAC AGCTGTCAAA CAAATGCTTC TTTGGATATG TTACCTAGTC 600 

AGATATTTGA TGTGGTCATG AGTGAAGAGA GCACAAGTAT TTACAATGCT GTGTACGCTG 660 

TGGCTCACAG CCTCCATGAG ATGAGACTTC AGCAACTTCA AACACAACCG TGTGAAAATG 720 

AAGAAGGGAT GGAGTTCTTT CCATGGCAGC TTAATACTTT CCTGAAGGAT ATTGAGGTGA 780 

GAGTCAACAG TTTAGACTGG AGACAGAGAA TAGATGCTGA ATATGACATT CTTAACCTCT 840 

GGAATTTACC AAAGGGTCTT GGACTAAAAG TGAAAATAGG AAACTTTTAT GCAAATGCTC 900 

CCCAGGGTCA ACAATTGTCT TTATCTGAAC AGATGATTCA ATGGCCAGAA ATATTTTCAG 960 

AGATCCCTCA GTCGGTGTGC AGTGAGAGTT GTGGGCCTGG ATTCAGGAAA GTAACCCTGG 1020

AGAATAAGGC TATCTGCTGC TACAATTGTA CTCCCTGTGC AGACAATGAG ATTTCTAATG 1080 

AGACAGATGT AGACCAGTGT GTGAAGTGTC CAGAGAGTCA TTATGCAAAT ACAGAGAAGA 1140 

GCAACTGCTA TCAAAAGTCT GTGAGCTTTC TGGGCTATGA AGACCCTTTG GGGATGGCTC 1200 

TAGCCAGCAT AGCTTTGTGC TTGTCTGCAC TAACTGCCTT TGTTATTGGC ATATTTGTGA 1260 

AACACAAAGA CACTCCTATT GTTAAGGCCA ATAATCAAGC TCTGAGTTAC ACTTTGCTCA 1320 

TCACACTCAA ATTCTGTTTC CTATGTTCTT TGAACTTCAT TGGTCAGCCC AACACAGTTG 1380 

CCTGCATCCT TCAGCAGACC ACCTTTGCAG TTGCTTTCAC TATGGCTCTT GCCACTGTGT 1440 

TGGCCAAAGC TATCACTGTG GTTCTTGCCT TTAAGGTCAG TTTTCCAGGG AGAATGGTAA 1500 

GATGGCTAAT GATATCAAGG GGTCCAAACT ATATCATTCC TATCTGCACC CTGATCCAAC 1560 

TTCTTCTTTG TGGAATATGG ATGGCAATAT CTCCACCATA CATTGACCAA GATGCTCATA 1620

TTGAACATGG TCACATCATC ATTTTGTGCA ACAAGGGCTC AGCTGTTGCC TTCCACTCTG 1680 

TCCTGGGATA CCTCTGCTTC TTGGCCCTTG GGAGTTATAC CATGGCCTTC TTGTCAAGAA 1740 

ATTTGCCTGA TACATTCAAC GAATCCAAAT TTATCTCACT AAGTATGCTG GTATTCTTCT 1800 

GTGTCTGGAT CACCTTTCTT CCTGTCTACC ACAGCACTAA AGGGAAGGTC ATGGTCGCCG 1860 

TCGAGGTCTT TTGCATCCAA GCCGAATTC 1889 



(2) INFORMATION FOR SEQ ID NO: 74: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1889 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



GAATTCGGCT TCTGCATCAA ATGGCGACGA AGGACACATC TCTTTCACTT GCCATTGTTT 60 

CTTTGATGGT TCATTTTAGG TGGTCTTGGG TTGGTCTAAT TCTCCCAGAT GACCACAAAG 120 

GAAATAAAAT ACTATCAGAT TTTAGAAAGG AGATGGAGAG AAAAAGAATC TGTACGGCTT 180 

TTGTAAAAAT GATTCCTGCC ACATGGACTT CATCTTTTGT CAAATTCTGG GAAAATATGG 240 

ATGACACCAA CATAATAATT ATTTATGGTG ACATTGATTC TCTAGAAGGT CCAATGCGAA 300 

ATATTGGGCA AAGGTTATTG ACATGGCATG TCTGGGTCAT GAACATTGAA CCCCATATTA 360 

TTGAATATGA TAATTATTTC ATGTTAGATT CATTCCATGG AAGTTTAATT TTTAAGCACA 420 

ATTATAGAGA GAATTTTGAG TTTACCAAAT TTATTCGAAC AGTTAATCCT AAAAAATACC 480 

CAGAAGACAT TTATCTCCCT AAGATGTGGT ATTTGTTCTT CATGTGCTCA TTTTCTGATA 540 

TTAATTGTCA AGTTTTGGAC AGCTGTCAAA CAAATGCTTC TTTGGATATG TTACCTAGTC 600 

AGATATTTGA TGTGGTCATG AGTGAAGAGA GCACAAGTAT TTACAATGCT GTGTACGCTG 660 

TGGCTCACAG CCTCCATGAG ATGAGACTTC AGCAACTTCA AACACAACCG TGTGAAAATG 720 

AAGAAGGGAT GGAGTTCTTT CCATGGCAGC TTAATACTTT CCTGAAGGAT ATTGAGGTGA 780 

GAGTCAACAG TTTGGACTGG AGACAGAGAA TAGATGCTGA ATATGACATT CTTAACCTCT 840 

GGAATTTACC AAAGGGTCTT GGACTAAAAG TGAAAATAGG AAACTTTTAT GCAAATGCTC 900 

CCCAGGGTCA ACAATTGTCT TTATCTGAAC AGATGATTCA ATGGCCAGAA ATATTTTCAG 960

AAGTCCCTCA GTCTGTGTGC AGTGAGAGTT GTAGGCCTGG ATTCAGGAAA GTATCCCTGG 1020 

ATGATAAGGC CATCTGCTGC TACAAGTGCA CTCCTTGTGC CGACAATGAG ATATCTAATG 1080 

AGACAGATGT AGACCAGTGT GTGAAGTGTC CAGAGAGTCA TTATGCAAAT ACAGAGAAGA 1140 

GCAACTGCTT CCCAAAATCT GTGAGCTTTC TGGCCTATGA AGACCCCTTG GGGATGGCTC 1200 

TAGCCAGCAT AGCTTTGTGC TTATCTGCAC TCACTGTCTT TGTTATTGGC ATCTTTGTGA 1260 

AAAACAGAGA CACTCCTATT GTCAAGGCCA ATAATCGGAC TCTAAGTTAC ATTTTGCTCA 1320 

TCACACTCAC CTTTTGTTTC TTATGTTCTT TGAACTTCAT TGGTCAGCCC AACACAGCTG 1380 

CCTGCATCCT TCAGCAGACC ACCTTTGCAG TTGCTTTCAC TATGGCTCTT GCCACTGTGT 1440 

TGGCCAAAGC TATTACTGTA GTCCTTGCCT TTAAGATCAG TTTTCCAGGG AGAATGTTAA 1500 

GGTGGCTAAT GATATCAAGG GGTCCAAGAT ACATCATTCC TATCTGCACA CTGATCCAGC 1560 

TTCTTCTTTG TGGAATATGG ATGGCAACTT CTCCACCATT CATTGACCAA GATGTTAATA 1620 

CTGAAGATGG ATACATCATC CTTTTGTGCA ACAAGGGCTC AGCTGTTGCC TTCCATTCAG 1680 

TCCTGGGATA CCTCTGTTTC TTGGCCCTTG GGAGTTATAC CATGGCCTTC TTGTCTAGAA 1740 

ATTTGCCTGA TACATTCAAT GAATCCAAAT TTCTGTCATT CAGTATGCTG GTGTTCTTCT 1800 

GTGTCTGGGT CACCTTTCTT CCTGTCTACC ACAGCACTAA AGGGAAAGTT ATGGTCGTCG 1860 

TCGAAGTCTT CTGCATCCAA GCCGAATTC 1889 



(2) INFORMATION FOR SEQ ID NO: 75: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 



ATGAAGAAGC TCTGTGCTTT CACGATTTCA TTGTTGTTTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTGGA GTGAACCAAG TTGCTTTTGG AGGATAAAGA ATAGTGATGA TAATGACGGA 120 

GATTTGCAAA GGGAATGTCA TTTTTACCTT GGGGCAGCTG ATACACCAGT TGAAGATAAT 180 

TTTTATAGTT CACTTTTAAA ATTTAGGTTT TCTTTGGACC ATTTAATCCT AACCTACGCG 240 

ACCATGACCG GCTGCCCCAT GTCCATCAGG 270 



(2) INFORMATION FOR SEQ ID NO: 76:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1308 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 



ATGAAGAAGC TCTGTGCTTT CACGATTTCA TTGTTGTTTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTGGA GTGAACCAAG TTGCTTTTGG AGGATAAAGA ATAGTGATGA TAATGACGGA 120 

GATTTGCAAA GGGAATGTCA TTTTTACCTT GGGGCAGCTG ATACACCAGT TGAAGATAAT 180 

TTTTATAGTT CACTTTTAAA ATTTAGAATT GCAGCAAGTG AATATGAGTT TCTTCTCGTA 240 

ATGTTTTTTG CTATCGATGA GATCAACAGG AATCCTTATC TTTTACCCAA CATAACTTTG 300 

ATGTTCTCCT TCATTGGTGG AAACTGTCAG GATTTATTGA GAGTTATGGA CCAAGCATAT 360 

ACACAAATAA ATGGACATAT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

GCCATAGGTC TTACAGGACC ATCATGGAAA ACTTCCTTAA AACTGGCAAT GCACTCTTCG 480 

ATGCCACTGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTGTCCC ATGGCATGGT CTCCTTGATG 600 

TTTCACTTTA GATGGACTTG GATAGGAATG GTCATCTCAG ATGATGACCA GGGTATTCAG 660

TTTGTCTGAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTAGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TCAACAAATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGAAA TGAACTCTAC TCTAGAAGTA 840 

AGCTTTAGAA GATGGGAAGA GTTAGGTGCT CGGAGAATCT GGATCACAAC CTCACAATGG 900 

GATGTCATCA CAAATAAAAA AGACTTCACC CTTAATCTCT TCCATGGGAC TATCACTTTT 960 

GCACACCACA GAGTTGAGAT TCCTAAATTA AATAAATTCA TGCAAACAAT GAACACTGCC 1020 

AAATACCCAG TAGATATTTC TCATACTATA TTGGAGTGGA ATTATTTTAA TTGTTCAATA 1080 

TCTAAGAACA GCATTAGAAT GCATCATATT ACATTCAACA ACACCTTGGA ATGGACATCA 1140 

CTGCACAACT ATGATATGGC GATGAGTGAT GAAGGTTACA GTTTATATAA TGCTGTTTAT 1200 

GCTGTGGCCC ACACCTACCA TGAATACATT TTTCAACAAG TAGAGTCTCA GAAAAAGGCA 1260 

AAACCCAAAA GATATTTCAC TGCTTGTCAG CAGATATGGA ACAGTGTG 1308 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1296 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 



ATGAAGAAGC TCTGTGCTTT CACTATTTCA TTTTTGTCTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTTGA CTGAAGCAAG TTGCTTTTGG AGGATAAAGA ATAGTGAAGA TAGTGATGGA 120 

GATTTGCAAA GAGAATGTCA TTTTTACCTT TGGGTAATTG ATAAACCTAT TGAAGATAAT 180 

TTTTATAATT CAGTTTTAAA TTTTAGAATA TCAGCAAGTG AATATGAGTT TCTTCTGGTA 240 

ATGTTTTTTG CTACTGATGA GATCAACAAG AATCCTTATC TTTTACCCAA CATAACTTTG 300 

ATATTCAGCA TCGTTGGTGG TCACTGTCAT GATTTATTGA GAGGTCTGGA TCAATCATAT 360 

ACACAAATAA ATGGACGTGT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

AACATAGGCC TTACAGGACC ATCATGGAAA AAATCCTTAA AACTGGCAAT GGATTCTTCA 480 

ATACCAATGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTATCCC ATGGCATGGT CTCCTTGATG 600

TTTCATTTTA GATGGACTTG GATAGGACTG GTCATCTCAG ATGATGACCA GGGTATTCAG 660

TTTCTCTCAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTAGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TAAACAAATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGAAA TGAACTCTAC TCTAGAAGTA 840 

AGCTTCAGAA GATGGGAAGA TTTAGGTGCT CGGAGAATCT GGATCACAAC CTCACAATGG 900 

GATATCATAT TAAATAAAAA AGAATTCACT CTTAATCTCT TCCATGGCCC TATCACTTTT 960 

GCACACCACA AAGTTGAGAT TCCTAAATTA AGGAATTTTA TGCAAACAAT GAACACTGCC 1020 

AAATACCCAG TAGATATTTC TCATACTATA CTGGAGTGGA ATTATTTTAA TTGTTCAATC 1080 

TCTAAGAACA GCAGTAAAAT GGATCTTTTT ACATCCAACA ACACATTGGA ATGGACAGCA 1140 

CTGCACAACT ATGATATGGC CATGAGTGAT GAAGGTTACA ATTTGTATAA TGCTGTTTAT 1200 

GTTGCGGCCC ACACCTACCA TGAACACATT CTTCAACAAG TAGAGTCTCA GAAAAAGGTA 1260 

GAACACAACA GATATTTCAC TGTTTGTCAG CAGATA 1296 
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1521 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

ATGAAGAAGC TCTGTGCTTT CACTATTTCA TTTTTGTCTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTTGA CTGAAGCAAG TTGCTTTTGG AGGATAAAGA ATAGTGAAGA TAGTGATGGA 120 

GATTTGCAAA GAGAATGTCA TTTTTACCTT TGGGTAATTG ATAAACCTAT TGAAGATAAT 180 

TTTTATAATT CAGTTTTAAA TTTTAGAATA TCAGCAAGTG AATATGAGTT TCTTCTGGTA 240 

ATGTTTTTTG CTACTGATGA GATCAACAAG AATCCTTATC TTTTACCCAA CATAACTTTG 300 

ATATTCAGCA TCGTTGGTGG TCACTGTCAT GATTTATTGA GAGGTCTGGA TCAATCATAT 360 

ACACAAATAA ATGGACGTGT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

AACATAGGCC TTACAGGACC ATCATGGAAA AAATCCTTAA AACTGGCAAT GGATTCTTCA 480 

ATACCAATGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTATCCC ATGGCATGGT CTCCTTGATG 600 

TTTCATTTTA GATGGACTTG GATAGGACTG GTCATCTCAG ATGATGACCA GGGTATTCAG 660 

TTTCTCTCAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTAGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TAAACAAATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGAAA TGAACTCTAC TCTAGAAGTA 840 

AGCTTCAGAA GATGGGAAGA TTTAGGTGCT CGGAGAATCT GGATCACAAC CTCACAATGG 900 

GATATCATAT TAAATAAAAA AGAATTCACT CTTAATCTCT TCCATGGCCC TATCACTTTT 960 

GCACACCACA AAGTTGAGAT TCCTAAATTA AGGAATTTTA TGCAAACAAT GAACACTGCC 1020 

AAATACCCAG TAGATATTTC TCATACTATA CTGGAGTGGA ATTATTTTAA TTGTTCAATC 1080 

TCTAAGAACA GCAGTAAAAT GGATCTTTTT ACATCCAACA ACACATTGGA ATGGACAGCA 1140 

CTGCACAACT ATGATATGGC CATGAGTGAT GAAGGTTACA ATTTGTATAA TGCTGTTTAT 1200 

GTTGCGGCCC ACACCTACCA TGAACACATT CTTCAACAAG TAGAGTCTCA GAAAAAGGTA 1260 

GAACACAACA GATATTTCAC TGTTTGTCAG CAGGTATCTT CCTTGATGAA AACCAGGGTA 1320 

TTTACGAACC CGGTTGGAGA ACTGGTGAAC ATGAAGCATA GGGAAAATCA GTGTACAGAG 1380

TATGATATTT TCATCATTTG GAATTTTCCA CAAGGCCTTG GATTAAAATT GAAAATAGGA 1440 

AGCTATATAC CTTGTTTTCC AAAGAGTCAA CAACTTCATA TATCTGATGA TTTGGAATGG 1500 

GCCATGGGAG GAACATCAAT A 1521 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 933 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

ATGAAGCAGC TCTGCACTTT CACTATTTCA TTGTTGTTTC TGAAGTTTTC TCTCATCTTG 60 

TGCTGTTGGA GTGAACCAAG CTGCTTTTGG AGGATAAAGA AGAGTGAAGA TAATGATGGA 120 

GATTTACAAA GGGAGTGTCA TTTTTACCTT TGGAAAACTG ATGAACCTAT TGAAGATAGT 180

TTTTATAATT ATGATTTAAG TTTTAGAATT GCAGGAAGTG AATATGAGCT TCTTCTGGTA 240 

ATGTTTTTTG CTACTGATGA GATCAACAAG AATCCTTATC TTTTACCCAA CATGAGTTTG 300 

ATGTTCTCCA TCATTGGTGG AAACTGTCAT GATTTATTGA GAAGTCTGGA TCAAGAATAT 360 

GCACAAATAG ATGGACATAT GAATTTTGTT AATTATTTCT GTTATTTAGA TGATTCATGT 420 

GCCACAGGCC TTACAGGACC ATCATGGAAA ACATCCTTAA AACTGGCAAT GCATTCTTCA 480 

ATGCCACTGG TTTTCTTTGG ACCATTTAAT CCTAACCTAC GCGACCATGA CCGGCTGCCC 540 

CATGTCCATC AGGTAGCCCC CAAGGACACA CATTTGTCCC ATGGCATGGT CTCCTTGATG 600

TTTCATTTTA GGTGGACTTG GATAGGACTG GTCATCTCAG ATGATGATCA GGGTATTCAG 660 

TTTCTCTCAG ATTTAAGAGA AGAAAGCCAA AGGCATGGGA TCTGTTTGGC TTTTGTTAAT 720 

ATGATCCCAG AAAACATGCA GATATACATG ACAAGGGCTA CAATATATGA TACACAAATT 780 

ATGACATCTT CAGCAAAGGT TGTTATCATT TATGGTGACA TGAACTCTAC TCTAGAAGCA 840 
AGCTTTAGAA GATGGGAAGA GTTAGGTGCT CGGAGAATCT GGATCACAAC CACACAATGG 900

GATGTCATCA CAAATAAAAA AAGACTTCAC CCT 933 

(2) INFORMATION FOR SEQ ID NO:80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1236 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

GCAAGTTGCT TTTGGCGGAT AAAGAATAGT GAAGATAATG ATGGAGATTT GCAAAGGGAA 60 

TGTCATTTTT ACCTTGGGGC AGTTGATAAA CCAATTGAAG ATAATTTTTA TAATTCACTT 120 

TTAAAGTTTA GAATTGCAGC AAGTGAATAT GAGTTTCTTC TGGTAATGTT TTTTGCTACT 180 

GATGAGATCA ACAAGAATCC TTATCTTTTA CCCAACATAA CTTTGATGTT CTCCATCATT 240 

GGTGGAAACT GTGATGATTT ATTGAGAGGT TTGGATCAAG CATATACACA AATAAATGGA 300 

CATATGAATT TTGTTAATTA TTTCTGTTAT TTAGATGATT CATGTGCCAT AGGTCTTACA 360 

GGACCATCAT GGAAAACATC CTTAAAACTG GCAATGCATT CTTCAATGCC ACTGGTTTTC 420 

TTTGGATCAT TTAATCCTAA CCTACATGAC CATGACCGGC TGCACCATGT CCATCAAGTA 480 

GCCACCAAGG ACACACATTT GTCCCATGGC ATTGTCTCCT TGATGTTTCA TTTTAGATGG 540 

ACTTGGATAG GACTGGTCAT CTCAGATGAT GACAAGGGTA TTCAGTTTCT CTCAGATTTA 600 

AGAGAAGAAA GCCAAAGGCA TGGGATCTGT TTAGCTTTTG TTAATATGAT CCCAGAAAAC 660 

ATGCAGATAT ACATGACAAG GGCTACAATA TATGATAAAC AAATTATGAC GTCTTTAGCA 720 

AAAGTTGTTA TCATTTATGG TGAAATGAAC TCTACACTAG AAGTAAGCTT TAGAAGATGG 780 

GAAAATTTAG GTGCTCGGAG AATCTGGATC ACAACCTCAC AATGGGATGT CATCACAAAT 840 

AAAAAAGAAT TCACCCTTAA TCTCTTCCAT GGGACTATTA CTTTTGCACA CCGCAGATTT 900 

GAGATTCCTA AATTTAAAAA ATTTATGCAA ACAATGAACA CTGCCAAATA CCCAGTAGAT 960 

ATTTCTCATA CTATATTGGA GTGGAATTAT TTTAATTGTT CAATCTCTAA GAACAGCAGT 1020 

AAAATGGATC ATATTACATT CAACAACACA TTGGAATGGA CAGCACTGCA CAACTATGAT 1080 

ATGGTGATGA GTGATGAAGG TTACAATTTG TATAATGCTG TTTATGCTGT GGCCCACACC 1140 

TACCATGAAC ATATTTTTCA ACAAGTAGAG TCTCAGAAAA AGGCAAAACC CAAAAGATTT 1200 

TTCACTGTTT GTCAGCAGCA GATATGGAAC AGTGTG 1236 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2412 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

ATGTTCATTT TCATGGAAGT CTTCTTCCTC CTTAATATTA CACTTCTCAT GGCCAATTTC 60 

ATTGATCCCA GGTGCTTTTG GAGAATAAAT TTGGATGAAA TAATGGATGA ATATTTGGGA 120 

TTATCTTGTG CTTTCATCCT GGCAGCAGTT CAGACACCCA TTGAAAATGA TTATTTCAAC 180 

AAGACTCTTA ATGTTCTAAA AACAACTAAA AACCACAAAT ATGCTTTGGC ATTGGTGTTT 240 

GCAATGGATG AAATCAACAG AAATCCTGAT CTTTTACCAA ATATGTCTTT GATTATAAGA 300 

TACACTTTGG GCCGTTGTGA TGGAAAAACT GTAATACCTA CACCATATTT ATTTCGTAAA 360 

AAAAAAGAAA GCCCTATCCC TAATTATTTC TGTAATGAAG AGACTATGTG TTCCTATCTG 420 

CTTACAGGAC CCCATTGGGA GGTATCTTTA GGTTTCTGGA AGCACATGAA CAGCTTCTTA 480 

TCTCCACGTA TCCTTCAGCT TACCTATGGA CCTTTCCACT CCATCTTCAG TGATGATGAA 540 

CAATATCCCT ATCTCTATCA GATGGCCCCA AAGGACACAT CTCTAGCATT GGCAATGGTC 600 

TCCTTCATAC TTTACTTTAG CTGGAACTGG ATTGGCCTTG TCATTCCAGA TGATGACCAA 660 

GGAAACCAAT TTCTTTTAGA GTTGAAGAAA CAGAGTGAAA ACAAGGAAAT TTGCTTTGCC 720 

TTTGTGAAAA TGATCTCTGT TGATGATGTT TCATTTCCAC AAAATACTGA AATGTACTAC 780 

AACCAAATTG TGATGTCATC CACAAATGTT ATTATCATTT ATGGAGAAAC ATACAATTTC 840 

ATTGATTTGA TCTTCAGAAT GTGGGAACCT CCCATTTTAC AGAGAATATG GATCACCACA 900 

AAACAATTGA ATTTCCCTAC CAGGAAAAAA GACATAAGTC ATGGCACATT CTATGGATCA 960 
CTTACTTTTC TACCCCACCA TGGTGTGATT TCTGGTTTTA AAAATTTTGT ACAGACATGG 1020

TTCCATCTCA GAAACACAGA TTTATATCTA GTAATGCAAG AGTGGAAATA CTTTAACTAT 1080 

GAAGACTCAG CATCTACCTG TAAAATACTG AAGAACAATT CATCTAATGC CTCATTTGAT 1140 

TGGCTAATGG AACAGAAGTT TGACATGACC TTTAGTGAGA ATAGTCATAA CATATACAAT 1200 

GCTGTGCATG CCATAGCCCA TGCCCTCCAT GAGATGAATC TGCAACAGGC TGATAATCAG 1260 

GCAATAGACA ATGGGAAAAA GGAGCCCAGT TCCTCCCACT GCTTGAAGGT AAACTCCTTT 1320 

CTAAGAAGGA TTTACTTCAC TAATCCTCCT GGGGACAAAG TGTTTATGAA GCAAAGAGTA 1380 

ATAATGCACG ATGAATATGA CATTGTTCAC TTTGTGAATC TCTCACAACA CCTTGGGATT 1440

AAGATGAAGT TAGGAAAGTT CAGCCCATAT TTACCACATG GTCGACACTC TCACTTATAT 1500 

GTAGACAGGA TTGAGTTGGC CACAGGAAGA AGAAAGATGC CATCCTCTGT GTGCAGTGCT 1560 

GATTGTAGTC CTGGATTCAG AAGATTATGG AAGGAGGGAA TGGCAGCCTG CTGTTTTGTT 1620 

TGCAGCCCCT GCCCTGAAAA TGAAATTTCT AATGAGACAA CTGTGGTACT TTGTGTCTTT 1680 

GTGAAGCATC ATGACACTCC TATTGTGAAG GCCAATAACA GAAGCCTCAG CTACCTATTA 1740 

CTCATGTCAC TCATGTCCTG TTTTCTGTGC TCCTTTTTCT TCATTGGCCT TCCAAACAGA 1800 

GCCATCTGTG TCTTACAGCA AATCACATTT GGAATTGTAT TCACTATGGC TGTTTCCACA 1860 

GTTCTGGCCA AAACAGTCAC TGTGGTTCTG GCTTTCAAAG TCACAGACCC AGGAAGAAGA 1920 

TTGAGAAACT TCCTGGTATC AGGAAGACCC AACTACATTA TTCCCATATG TTCCCTACTC 1980 

CAATGTGTTC TGTGTGCAAT CTGGCTAGCA GTTTCTCCTC CCTTTGTTGA TATTGATGAA 2040 

CACACTCTCC ATGGCCACAT CATCATTGTG TGCAACAAGG GCTCAGTTAC TGCATTCTAC 2100 

TGTATCCTAG GATACTTGGC CTGCCTGGCA CTTGGAAACT TCTCTGTGGC TTTCTTGGCC 2160 

AAGAATCTGC CTGACACATT CAATGAAGCC AAGTTCTTGA CCTTCAGCAT GCTAGTGTTC 2220 

TGTAGTGTCT GGGTCACCTT CCTCCCTGTC TACCATAGCA CCAAGGGCAA ACACATGGTT 2280 

GCTGTGGAGA TCTTCTCCAT CTTGGCATCC AGTGCTGGGA TCCTTGGATG TATATTTGTA 2340 

CCCAAGATTT ATATCATTTT AATGAGACCA GAGAGAAATT CGACCCAAAA GATCAGGGAA 2400 

AAATCATATT TC 2412 



(2) INFORMATION FOR SEQ ID NO: 82: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 381 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

ATGTTCATTT TCATGGGAGT CTTCTTCCTC CTTAATATTA CACTTCTCAT GGCCAATTTC 60 

ATTAATCCCA GGTGCTTTTG GAGAATAAAT TTGGATGAAA TAACGGATGA ATATTTGGGA 120 

TTATCTTGTA CTTTCATCCT GGCGGCAGTT CAGACACCCA CTGAAAAAGA TTATTTCAAC 180 

AAGACTCTTA ATGTTCTAAA AACAACTAAA AACCACAAAT ATGCTTTGGC ATTGGTGTTT 240 

GCAATGGATG AAATCAACAG AAATCCTGAT CTTTTACCAA ATATGTCTTT GATTATAAGA 300 

TACACTTTGG GCCTTTGTGA TGGAAAAACT GTAACACCTA CACCATATTT ATTTCATAAA 360 

AAAAAAACAA AGCCCTATCC C 381 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 228 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

ATGAAAAACC TGTGTGTTTT CACTCTTTCC TTTTTCCTCC TGGAGTTTTC TCTGATCTTG 60 

TGCCATTTGA CTGAACCCAT TTGCTTTTGG AGGATAAATA ATAATGAAGA TAATGATGGA 120 

GATTTGAGAA GTGACTGTGG TTTTTTCCTT GCAGCAGTTG AGGGACCTAC TGACGACTCT 180 

TATAATATCT CTGATCTTAG GTTTTCTTTG GACCATTTAA TCCTAAGC 228 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1644 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

ATGTTAGAAT TGGCCCATGG CACTCTGACT TTCTCACCCC ATCATGGGGA GATTTCTGAT 60 

TTCACAAATT TTATGCAGGA AGTCACCCCT ATCAAGTACC CAGAAGACAT TTTTCTTCAC 120 

ATCTTGTGGA ACCAGTATTT CAATTGTCCA CTTTTGCATT CTGAGTGTAA AATCTTTGAA 180 

AACTGTATAC CCAATGCCTC TTTGGAATTG TTGCCAGGGG GTGTTTTTGA GCTGGTCATG 240 

ACTGAAGAGA GTTACAATGT GTACAATGCT GTGTATGCAG TGGCCCACAG TCTCCATGAG 300 

AAGGCTCTCC ATCAAGTAGA AATTCAACCA CAGGATAATA AAGATAGGAC TATATTATTT 360 

CCTTGGCAGC TTCACCCTTT TCTGAAGAAC ATTCAGCTGA TAAATTCTGT TGGTGATCGT 420 

GTGATTCTGG ACTGGAAAAA GAAGACGGAT ACAGAGTATG ATATTTCCAA TATTTGGAAT 480 

TTCCCAACAG GTCTTTCCTT ATTAGTGAAA GTGGGTACAT TTGCTCCAAG TGCTCCCAAG 540 

GGGGAACAAC TTTCGATATC TGAACACACA ATTAACTGGC CCATAGGATT TACAGAGATT 600 

CCAAAGTCTG TATGCAGTGA GAGCTGCAGT CCTGGACACA GGAAAGTCAT CCTGGAGAGC 660 

AAGCCTGCCT GTTGCTTTGA CTGCACTCCT TGCCCAGATA AAGAGATTTC CAACGAGACA 720

GATGTGGGTC AGTGTGTGAA GTGTCCTGAA TCTCATTATG CAAATACAGA GAAGAGTCAC 780 

TGCCTGAAGA AGACTATGAC CTTTCTGGAT TATAATGATT CCTTGGGGAC GGGACTCACA 840 

CTCATGTCTC TGGGATTCTT TGTTGTCACA GGTCTTGTTA TTGGGGTTTT TATAATCCAC 900 

AGAAACACTC CAATTGTGAA GGCCAATAAT AGATCTCTCA GTTATATCCT GCTCATCACT 960 

CTCACTCTCT GTTTCCTTTG TCCCTTGCTC TTCATTGGGC TTCCAAACAC AGCCACATGT 1020 

ATCCTACAGC AGAACTTGTT TGGACTTCTC TTCACTGTGG CTCTATCCAC AGTGTTGGCC 1080 

AAAACTATCA CTGTAGTTAT GGCATTCAAG ATTACTGCTC CAGGAAGAAA GACAAGATGG 1140 

TTGCTGATAT TAAGAGCCCC TCAGTTCATC ATTCCACTTT GTGCCCTGAT GCAAATCCTT 1200 

TTCTCTGGGA TATGGCTGGG AACATCTCCT CCATTTGTTG ACATGGATGC TCACTCTGAA 1260 

CATGGGCACA TCATCATTCT ATGCAACAAG GGCTCAGCTA TTGGCTTCTA CTGTACTCTG 1320 

GCCTACCTGG GAGTCATGGC CTTTGGTAGT TACCTCTTGG CTTTCATGTC CAGGAATCTT 1380 

CCTGACACAT TTAATGAATC CAAGGCCCTG GCTTTCAGCA TGCTGATGTT CTGCAGTGTC 1440 

TGGGTCACAT TCCTCCCTGT CTACCACAGC ACCACTGGGA AGGTCAGGGT GGCTATGGAA 1500

ATGTTTTCTA TCTTGGCTTC CAGTGCAAGC ATTCTAACCC TAATCTTTGT CCCTAAGTGC 1560 

TACATTGTTT TGTTCAGACC AGAGAGGAAC ATACTTCCTC TAAACAGAGA AAAAAGACAG 1620 

CATAGGAGTA AAAATTCTGA AACA 1644 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

ATGGAGGAAA TCAACAGGAA CCCTGATCTT TTACCAAATA TGTCTTTGGT TATAAAACAT 60

ACTTTGAGCT ATTGTGATGG AAATACTGCA GACCATATAT TTAAAGAAAA ATTTTATAAG 120 

CCTTTACCTA ATTATGTCTG TAATGAAGAG ACTATGTGTT CATTTATGCT TATAGGGCTG 180 

AATTGGGTAT TGTCTCTAAC ACTTTTTAAA GACTTGGACA TCTTCTCATT TCCACGTTTC 240 

CTTCAAATTT CCTATGGACC TTTCCATTCC ATCTTCAGTG ATAATGAACA ATTTCCATAT 300 

CTCTATCAGA TGACCCCAAA GGACACATCA CTAGCATTGG CAATTGTCTC CTTCTTACTT 360

TACTTCAATT GGAACTGGGT TGGGCTTGTC ATCTCTGATA ATGATGAAGG CAATCAATTT 420 

CTCTCAGAGT TGAAAAAAGA GACCCAAAAC AAGGAAATTT GCTTTGCCTT TGTTAACATG 480 

ATGTCAATCC ATGAGCATTC ATCTTATCAA AAAACTGAAA TGTACTACAA TCAAATAGTG 540 

ATGTCATCAA CAAATATTAT TATCATTTAT GGGAAAACAA ACAGTATCAT TGAATTGAGC 600 

TTCAGAATGT GGGTATCTCC AGTTATACAG AGGATTTGGG TCACAAACTC AGAGTTGGAT 660 

TTCCCGACAA GTATGAGAGA CTTCACTCAT GGCACATTCT ATGGGACTCT GACATTTCTA 720 

CACCACCATG GTGAGATTTC TGGATTTACA AATTTTTTCG AGACATGGGA CCATCTCAGA 780 

AGCAGAGATT TAAATCTATT AATACCAGAG TGGAAGTACT TTAGCTATGA TGCCTCAGGA 840 
TCTAACTGTA AAATATTGAG GAACTATTCA TCCAATGCCT CATTGGAATG GATAACAGAA 900

CAGAAGTTTC ACATGGCCTT TAATGATTAT AGTCATAGTA TATATAATGC TGTGTATGCC 960 

ATGGCCCATG CCCTCCATGA GACTAATCTG CAAGAGGTTG ATAATAAGGA AATAAGAAAT 1020 

GGGAAAGGAG CAAGTACTCA CTGCTTGAAG GTAAACTCAT TTCTCAGAAA GACCCACTTT 1080 

ACTAATTCTC ATGGAGAGAG AGTGATTATG AAACAGAGAG TGAGAGTACA GGAAGACTAT 1140 

GACATTGTTC ACATTCAGAA TTTCTCACAA CACCTTCGGA TTAAGATGAA GATAGGAAAG 1200 

TTCAGCCCAT ATTTTACACA TGGTGGACCC TTTCACTTAT ATGAAGACAT GATTCAGTTG 1260 

GCCACAGGAA GTAGAAAGAT GCCGTCCTCT GTGTGCAGTG CAGATTGTAG TCCTGGATTC 1320 

AGAAAATCCT GGAAGGAGGG AATGGCCCCC TGCTGTTTTA TTTGCAGCCT GTGCCCTGAA 1380 

AATGAAATTT CTAATGAGAC AAATATGGAT CAATGTGTGA ATTGTCCAGA ATACCAATAT 1440 

GCCAACACAG AAAAGAACAA ATGCATTCAG AAAGACGTGA TTTTTCTAAG CTATGAAGAC 1500 

CCCTTGGGAA TGGCTCTTGC CTTAATTGCC TTCTGTTTGT CTGCATTCAC AGCTGTGGTA 1560 

CTTTGGGTCT TTGTGAAGCA CCATGACACT CCTATTGTGA AGGCCAATAA CAGAATCCTC 1620 

AGCTACATAT TAATCATGTC ACTAATGTTC TGTTTTCTCT GCTCCTTTTT CTTCATTGGC 1680 

CATCCTAACA GAGGTACCTG TATCTTACAG CAAATCACAT TTGGCATTGT ATTCACTGTG 1740 

GCTGTTTCCA CAGTTCTGGC CAAAACAATC ACTGTCATTC TTGCTTTCAA ACTCAGAGAC 1800 

CCAGGGAGAA GTTTAAGAAA CTTCCTGGTA TCTGGTGCAC CCAACTACAT TATTCCTATA 1860 

TGTTCCTTAT TGCAATGTAT TCTGTGTGCA ATTTGGCTAG CAGTTTCTCC TCCTTTTGTT 1920 

GATATTGATG AACATTCTGA GCATGGCCAC ATCATGATTG TGTGCAACAA GGGCTCCATT 1980 

ATGGCATTCT ACTGTGTCCT AGGATACTTG GCCTGCCTGG CGCTTGGAAG CTTCACTACA 2040 

GCTTTCTTGG CAAAGAATCT GCCAGACACA TTCAACGAAG CCAAGTTCTT GACCTTCAGC 2100 

ATGCTAGTGT TCTGCAGTGT CTGGGTCACC TTTCTCCCTG TGTACCATAG CACAAGGGGC 2160 

AGGGTCATGG TTGCTGTTGA GATCTTCTCT ATCTTGGCAT CCAGTGCAGG GATGTTTGGA 2220 

TGCATCTTTG CACCCAAAAT CTACATCATA TTAATGAAAC CAGAAAGAAA TTCTATACAA 2280 

AAGTTCAGGG AGAAATCATA TTTC 2304 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2001 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

ATGGCTCCTA AGGACACATC TCTGGCACTG GCCATGGTTT CTTTGTTTGT CCATTTCAGC 60 

TGGAACTGGG TAGGAGCTGT TGTTTCAGAT GATGACCCAG GTTATGAATT TATCTTGGAA 120 

TTGAGAAGAG AAATGCAAAG GAACAATTTT TGTTTAGCAT TTGTGAGTAT CATTGTTAGT 180 

GATGACAATT TATTTCTGAA AAGGTATAAT ATCTATTACA ACCAGATCAA GATGTCATCA 240 

GCAAAAGTTG TTATCATTTA TGGAGACAAA GACTCTCCTC TACAGGTGAA CTTTAGACTA 300 

TGGAATTTAT TTGATATCCA AAGAATCTGG GTCACTACTT CACAGTGGGA TATGATCATA 360 

AATAATGGAA AATTCCTCCT TAATTCCTTC TATGGGACTC TCAGTTTTTC ACATCACTAT 420 

TCTGAATTAT CTGGTTTTAA AACATTTATC CAGACAGCAT ACCCTTCAAA CTACAGTGAT 480 

GACTTTTCTC TTGGTATATT ATGGTGGGTG TATTTTAATT GTTCTTTGTC ATTATCTGAA 540 

TGTAAGAATC TGCAAAATTG TCCAAAGGAA AACATATTTA GATGGTTATA CAGGCACCAT 600 

TTTGAAATGT CTTTGAGTGA TACTACTTAT GACCTATATA ATTCTATGTA TGCTGTGGCT 660 

TACACACTCC AACAGATGCT TCTGAAACAA GCAGATACAT GGCAAATAGA TGATGGAAAA 720 

GAACCAGAAT TTGACTCTTG GCAGATGCTC TCTTTCCTGA GAAATATCCA ATTTATAAAC 780 

CCTGTTGGTG ACAAAGTGAA CCTGAATCAT GAAGAAAAAC TGGATACAAA GTATGAGATT 840 

CACCAGACTT TGACTTTTTT GCCAAATCCT GTATTTAAGC TGAAAATAGG AACATTTTCC 900 

CAAAACTTAT CACATGGTCG ACAATTATAT ATGTTGAAAG AAATGATAGA GTGGAACACA 960 

GGCCACCAAC AGTCTCCAAC CTCAGTTTGC AGTATTCCTT GTAGTCCAGG ATTCAGAAAA 1020 

TCCCCTCAGC TGGGAAAGCC TGTTTGCTGT TTTGATTGTA CACCCTGCCC AGAAAATGAA 1080 

ATTTCCAACA TGACAAACAT GAATCAATGT ATCAAGTGTC TAAATGATCA GTATGCCAAT 1140 

CCTGGAGGAA CTCGCTGCCT CAAAAAAGTT ATTGTATTCC TGGGTTATGA AGATCCATTG 1200 

GGAATGTCTC TGGCTATCTT GGCTCTGTGC TTCTCTGCTC TCACAGCTTT TGTACTTAGT 1260 

ATCTTTTTGA AGCACCAAGA AACACCCACT GTCAAGGCCA ATAATAGAAC TCTCAGCTAT 1320 

GTTCTACTCA TCTCCCTCAT CTCTTGTTTT CTCTGCTCCT TGCTCTTCAT TGGTCATCCC 1380 

AGCTTTACCA CATGTATCAT GCAGCAGACC ACATTTGCTG TTGTGTTCAC TGTAGCTGCA 1440

TCTACTGTCT TGGCCAAAAC AATTATTGTA ATATTGGCCT TCAAGGTTAC TAATACAAGT 1500 

AGAAAAATGA GGTGGCTGCT GGTATCAGGG GCACCTAAAT TCATCATTCC AATTTGCACA 1560 

ATGATTCAAC TGATTCTCTG TGGAATTTGG CTGGGTACTT CTCCTCCATT TGTTGATGCT 1620 
GATGGACATG TTGAAAAAGG CCACATTTTG ATTTTCTGTA ACAAAGGTTC AATTCTTGCT 1680

TTCTATTGTG TCCTGGGATA CTTAGTCTCC ATTGCCATTG CAAGTTTCAC CCTTGCATTC 1740 

TTCGCCAGAA ATCTGCCCGA CACATTCAAT GAAGCCAAGT TCCTAACATT CAGTATGCTA 1800 

GTATTTTGCA GTGTCTGGGT CACCTTTCTT CCTGTCTATC ATAGCACCAA GGGCAAGTCT 1860 

ATGGTGGCTG TGGAAGTTTT CTGTATATTG GCCTCTAGTG CAGGGCTGCT TTTTTGCATC 1920 

TTTGCACCAA AGTGCTTCAT TATTTTGTTA AGACCTGAGA AAAAATCTTT TCAGAAGTTT 1980 

CAGAATATAC ATTCTAAAAT T 2001 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2598 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

ATGTCCAGGC TCAGAGCAGG AAAAAATATG CTCACCTTCA TTTTACTCTT CTTTCTCCTG 60 

AACATTCCAC TTTTTGTGCC TAGTTTTATT TATCCCAGGT GCTTTTGGAG TATGAAGAAG 120 

AATGAATATC AGGATAGAAA CCTGGGAACA GGTTGTATGT TCTTTATTCT AGCAGTGCAA 180 

CAGCCTATGG AAAAAGAGTA TTTCAGTCAT ATTTCGAATA TACAAACACC TACTGAAAAC 240 

CAAAAGTATC CTCTCACCTT GGCTTTTTCC ATGAATGAAA TCAACAACAA CCCTGATCTT 300 

TTGCCAAATA TGTCTTTAGC ATTTACATTC TCAGAATATA GTTGTTATTT GGAATCCCAC 360 

CACAAAAGAT TATTTAATTT TTCTTTAAAA AATCATGAAA TTCTCCCTAA TTTTATCTGT 420 

ACAAAAGACA TCAAGTGTGG AGTGGTACTT ACCGGACTTA GTTTGGTAAC AACTGTGACA 480 

CTTCATATAA TCCTAAACAA TTTCATATTT CAGCAGTTCC GTCAGCTTAC TTATGGACAC 540 

TTTCATCCTG CTCTGTGTGA TCATGAAAAT TTTCCTCATC TATATCAGAT GGCCTCTGAT 600 

GATACATCTC TAGCCCTTGC TCTCGTCTCC TTCATAATTC ATTTCAGTTG GAACTGGATA 660 

GGGTTGGCCA TCTCAGACAA TGATCAAGGC ATACATTTTC TCTCTTATTT GAGAAGAGAG 720 

ATGGAAAAAA ATACAGTCTG CTTTGCCTTT GTCAACATTA TTCCAGTCAA TATGAATTTA 780 

TACATGTCAA GAGCTGAAGT GTATTACAGC CAAGTTATGA CATCATCCGC AAATGTTGTT 840 

ATCATTTATG GTGATACAGG GAATACGTTA GCTGTGAGCT TTAGAATGTG GGACTCTCTA 900 

GGTATACAGA GACTATGGGT CACCACCTCA CAGTGGGATG TCACTCCTTT TAAGAAAGAC 960 

TTCACATTTG ATAATGGATA TGGAACTTTT GGTTTTGGAC ACCGCCACAG TGAGATTTCT 1020 

GGTTTTAAAT ATTTTGTTCA GACATTGAAC CCTTTCAAAT ACTCAGATGA ATATTTGGTA 1080 

AAGCTGGAAT GGATGTATGT TAATTGTAAA ATCTTAGAAT ATAACTGTAA GTCACTGAAG 1140 

AACTGCTCCT TTAATCACTC ATTGGAATGG CTAATGACAC ATACTTTTGA CATGGCCATT 1200 

ATTGAAGGGA GTTATGAAAT ATACAATGCT GTGTATGCTT TTGCCCATGC ACTCCATGAG 1260 

ATGACTCTTC AAAATGTTGA TAATGTTCTC CTTCCCAATT ATGAAGAACA AAATTATAAT 1320 

TGCAAGATGG TTTATTCCTT TCTGAGCAAG ACTCAATTCA CAAATCCTGT TGGAGACACT 1380 

GTGAATATGA ATCAAAGAAA CAAACTGAAG GAAGAGTACG ACATTTTCTA CAATTGGAAT 1440

TTTCCACAGG GACTTGGATT TAAAGTGAAA ATAGGAATAT TTAGTCCATA TTTTCCAAAA 1500 

GGTCAACAGC TTCATTTATC TGAAAATCTG ATAGAGTGGT CCACAGGACG TATACAGATG 1560 

CCAACCTCTG TGTGCAGTGC CGATTGTGGT CCTGGATTTA GGAAAGTCTG GAAGAATGGA 1620 

ATGCCAGCCT GTTGTTTTGA CTGCAGTCCC TGCCCAGAAA ATGAAATTTC TAATGAGACA 1680 

AATGTGGAAT TGTGTGTCCA GTGTCCAGAG GACCAATATG CTAACCAAGA GCAGAATCAC 1740 

TGCATTCACA AAGCTCGTAT CTTTCTCTCT TATGATGAAC CCTTGGGGAT GGCTCTTTCC 1800 

TTAATGGCCT TATGCCTCGC TGCACTCACA GTTGTGGTTC TTGGAGTCTT TGTGAAACAT 1860 

CACAGAACTC CCATAGTTAA GGCCAATAAC TGCACTCTCA CCTACATCTT GCTCATCGCA 1920 

CTCATCTTTT GTTTCCTCTG CCCCTTGTTC TTCATTGGCC ATCCAAACTC AGCTACCTGC 1980 

ATCCTTCAGC AAATCACATT TGGAGTTGTG TTCACTGTGG CTATTTCCAC TGTGTTGGCC 2040 

AAAACAACCA CTGTCATTCT GGCTTTCAGA GTCACAGCCC CTCATAGAAT GATGAAGTAC 2100 

TTTCTTGTTT CAAGGGCATC TAACTACATC ATTCCCATTT GTACTCTCAT TCAAATTATT 2160 

GTATGTGCCA TCTGGCTAGG AGCTTCTCCT CCTTCTGTTG ATATTGATGC ACAGTCTGAG 2220 

CATGGTCACA TCATCATTGC TTGCAACAAG GGTTCAGTCA CTGCTTTTTA CTGTGTCCTG 2280 

GGATATCTGG CCTGCCTGGC CTTTGTGAGC TTCACCCTGG CTTTCCTTTC CAGAAACCTG 2340 

CCTGTCACCT TCAATGAAGC CAAGTCCATG ACATTCAGCA TGCTGGTGTT CTGCAGTGTC 2400 

TGGGTCACTT TCCTACCTGT TTACCATGGC ACCAAAGGCA AGGTTATGGT GGCTGTTGAG 2460 

ATCTTTTCCA CCTTGGCTTC TAGTGCAGGA ATGTTGGGAT GCATTTTTGC TCCAAAATGC 2520 

TACACAATAC TGTTTAGACC AGACAGAAAT TCTCTTCAAA TGATCAGGGA GAAGTCATCT 2580 

TCTCATACTC ACATTTTA 2598 






(2) INFORMATION FOR SEQ ID NO: 88: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2337 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 



ATGAGGTTTG CCATTGAGGA AATCAACAGC AATCCCCATC TTTTACCAAA CACATCCCTG 60 

GGATTTGAGA TCAATAATGT CCCACACGGT CAGAGGTACA CTCTGGTCAA ACTTTTTAGC 120 

TCACTTTCAG GGTCTAATTA TGACATTCCT AACTACATAA GTGCAAGTGA GAGCAATTCT 180 

GCTGCTGTAC TTACAGGACC ATCGTGGACA ATATCTGAAT GCGTAGGGAC ACTCCTGGAT 240 

CTTTACAAAT TTCCACAGCT TACTTTTGGG CCTTTTGATA GTCTCCTGAG TGAACAAAGA 300 

CGGTTTTCTT CTCTGTACCA AGTGGCCCCC AAAGATACAT TTCTGACGCC TGGCATTGTA 360 

TCTTTGATGC TTCATTTCCA CTGGAACTGG GTGGGGTTAT TCATCATAGA TGATGACAAA 420 

GGTGCCCAGA GACTGTCAGA CTTGAGAAAT GAGATGGATA AAAATGGAGT CTGCACAGCA 480 

TTTGTAGAAA TGATCCCAGT CATCAAGGGT TCATTTTTTA CCAAATCCTG GAAAAATCAT 540 

GTGCAGATCC TGGAATCATC ATCAAATGTG ATTATTATTT ATGGGGACTC TGATTCTCTA 600 

TTAAGCTTAA TAGTAAATAT TAAGCAGAAG TTGCTCACAT GGAAAGTGTG GGTACTGATC 660 

TCACAGTGGG ATGTTTCTAA ATTTGATGAT TATTTCATGG TAGACTCATT GCATGGAGCT 720 

CTTATTTTTT CACACCATCG TGAGGAGATT CCTAATTTTA CAGATTTTAT GCAGAAGTAC 780 

AACCCTTCCA AGTACCCGGA AGACACTTAT CTTCATGTAT TGTGGCACAT GTACTTCAAT 840 

TGCTCATTTG TTAAGAAAGA TTGTAAAATT GTGCACAACT GTTTGCCTAA TGCCTCCCTG 900 

GGGTTCTTGC CTGGGAACAT ATTTGACATG GCCATGAGTG AAGAGAGTTA CAATGTATAC 960 

AATGCTGTGT ATGCTGTGGC CCACAGTCTG CATGAGATGA TTCTCAACCA AGTACAATTT 1020 

CAAACTCATG AAAAAGGAAA AAAGATGGTA TTCTTTCCTT GGCAGCTTCA CCCCTTTCTA 1080 

AGGGAAAGAC AACTCATCAA TCAGAATGGA GCGAATGAAG ATCTGGATTG TACCAGGAAG 1140 

TCACATGTAG AGTATGACAT TCTCAACTTT TGGAATTTCC CAAAAGGTCT TGGGCTAAAT 1200 

GTGAAAGTAG GAACGTTTTC TCCAAGTGCT CCAAAGGAAC AGAAACTGTC CATATCTTCT 1260 

AACATGATAC AGTGGGCCAC AGGGTCGACA GAGATTCCAC AGTCTGTATG CAGTGAGAGC 1320 

TGTCATCCTG GATTCAGGAA AACCCACCAG GAAGGCAGGG TTGCCTGTTG CTTTGACTGC 1380 

ATTCCTTGTC CAGAAAATGA GATCTCCAAT GAGACAGATG TGGATCAGTG TGTGAAGTGT 1440 

CCAGAAACTC ACTATGCAAA CATAGAGAAG ATCCACTGCC TACAGAAAAC TGTGACATTT 1500 

CTGTACTATG ATGACCCATT GGGGAAGACA CTTTGCTTCA TGTCCCTGGG TTTCTCCTCA 1560 

CTCACAGCTG CTGTTCTTGT GGTGTTTCTG AAGAACAGGG ACACCCCCAT TGTCAAGGCC 1620 

AATAACCTGG CTCTCAGTTA CACCCTGCTC ATCACTTTGA TGCTCTGTTT TCTCTGTCCC 1680 

TTGCTCTTCA TTGGCCGTCC CAGCACAGCC TCCTGTATCC TGCAGCAAAA CATTTTTGGG 1740 

CTTCTGTTCA CTGTGGCTCT TTCCACTGTG TTGGCCAAAA CTATCACTGT GGTTATAGCC 1800 

TTCAAGATCA CTTCTCCAGG AAGAATTAGA AGATGGCTGC TGATATCAAG GGCCCCTAAT 1860 

TTCATTATTC CCTTATGCAC CCTGCTCCAA GTTTTTCTAT CTGGAATTTG GCTGACAACC 1920 

TCTCCTCCAT TTATTGATAA AGATGCTCAC TCAGAACATG GACACATCAT CATCATTTGC 1980

AATAAAGGCT CAGCTGTTGC TTTCCATTGC AACCTTGGAT ACCTGGGAGC ACTAGCCCTA 2040 

GTGAGCTACT TTATGGCTTT CTTGTCCAGA AACCTACCTG ACACATTCAA TGAAGCCAAG 2100 

TTCCTGGCTT TCAGCATGCT GGTGTTCTGC AGTGTCTGGG TCACCTTCCT CCCTGTCTAC 2160 

CACAGCACCA AGGGGAAGAA CATGGTGGCT ATGGAAGTCT TCTCTATCTT GGCTTCCAGT 2220 

ACATCTCTCC TAGGCATCAT CTTTGCCCCC AAGTGCTACC TCATATTATT AAGACCAGAA 2280 

AGGAATTCAC TTAGCTATAT CAGGGACAAA ACATATGCTA AAAGCATAAA ACCTTCT 2337 



(2) INFORMATION FOR SEQ ID NO: 89: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1650 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:



ATGAAGTTAA GGGATAAAGA CTTGAGCATA ACTTGTTCCT TCATCCTTGA AGCAGTTCAG 60

ATGCCTACGG AAAACGATTA TTTCAACCAG ACTCTGAATA TCCTAAAAAC AACAAAAAAC 120

CACAAATATG CTTTGGCATT GGCCTTTTCA ATTGATGAAA TCAACAGGAA TCCTGATCTT 180 

TTACCAAATA TGTCTTTGAT CATAAAATAC CCTTTGGGCC TTTGCGATGG ACAAACTACA 240 

TTACCTACAC CCTATTTATT TAATGAAATA TATTTTAGGC CTATCCCTAA TTATTTCTGT 300 

AATGAAGAGA CTATGTGTAC ATTTCTACTT ACAGGACCGC ATTGGATAAC ATCTTATAGT 360 

TTCTGGATAC ACTTGAACAT CTTCTTATCT CCTAGTATGA ACCCAAAGGA CACATCCCTA 420 

GCTTTGGCAA TGGTCTCCTT CTTACTTTAT TTCAAGTGGA ACTGGGTCGG CCTTGTCATC 480 

TCAGATGATG ATCAAGGCAA TCAATTTCTC TCTGAGTTGA AAAAAGAGAG CAAAATCAAG 540 

GAAATTTGCT TTGCATTTGT GAGCATGCTG GCAATCGATG AGATTTCATT TTATCATAAA 600 

ACTGAAATGT ACTACAACCA AATTGTGATG TCATCCACAA ACGTTATTAT CATTTATGGG 660 

AAAACAGAGA GTATTATTGA GTTGAGCTTC AGAATGTGGG AATCTCCAGT TATCCAGAGA 720 

ATATGGGTCA CCACAAAAGA AATGAATTTC CCTACCAGTA AGAGAGATTT AACTCATGAC 780 

ACATTCTATG GGACTCTTAC TTTTCTACAC AGCCATGGGG AGATTTCAGG CTTTAAAAAT 840 

TTTGTACAGA CATGGTACCA TCTTAGAATC ACTGATTTGC ATCTAGTAAT GCCAGAGTGG 900 

AAATATTTTA ACTATGAAGC CTCAGCATCT AACTGTAAAA TATTGAAGAA CTATTCATCC 960 

AGTGCCTCAT TGGAATGGTT AATGGAGCAG ACATTTGACA TGGTCTTTAG TGATGGAAGT 1020 

CGGGATATAT ATAATGCTGT AAATGCCATG GCCCATGCAC TCCATGAGAT GAATCTGCAC 1080 

CTGGTTGATA ATCAGGCAAT AGACAATGGG AAAGGAGCCA GTTCTCACTG CTTTAAGATA 1140 

AACTCCTTTC TCAGAAAGAC CCACTTCACT AATCCTCTTG GGGACAGAGT GATTATGAAA 1200 

GAGAGAGAAA TACTGCAAGA AGACTATAAC ATTTTTCACA CTTGGAATTT TTCTCAGCAC 1260 

ATTGGTTTTA AGGTGAAGAT AGGAAAGTTC AGCCCATATT TTCCACATGG CAGGCACTTT 1320 

CACCTATATG TAGACATGAT TGAGTTGGCT ACAGGAAGTA GAAAGATGCC ATCCTCTGTG 1380 

TGCACTGAAG ATTGTAGTCC TGGATACAGA AGATTCTGGA AGGAGGGAAT GGCAGCCTGC 1440 

TGTTTTGTTT GCAGTCCCTG CCCTGAAAAT GCAATTTCTA ATGAGACAAA TATGGATCAG 1500 

TGTGTGAATT GTCCAGAATA CCAATATGCC AATACAAAGC GGGACAAATG CATTCAGAAA 1560 

AATGTGATGT TTCTAAGCTA CAAAGACCCC CTTGGGGATG ACTCTTGCCT TCATAGCCTT 1620

CTTTTTCTCT GCATTAACAG CTGTTGTACT 1650 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2379 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

ATGATAGTAT TCTTTCTCCT CAACATTCCA CTTCTCATGG CAAATTCCGT TGATCCCAGG 60 

TGCTTTTGGA AAATAAATTT GAATGAAGTC AAGGATATAG ATTTAGATAC AAGTTGTTAC 120 

TTCATCCTTG AGGCAGTTCA GTTGCCTATG GAGAAAGATT ATTTCAACCA GACTCTGAAT 180 

GTCCTAAAAA CAACCAAATA CAACAGATAT GCATTGGCAT TAGCCTTTAC AATGGATGAA 240 

ATAAACAGGA ATCCTCATAT TTTACCAAAC ATGTCTTTGA TTATAAAACA TACATTGGGC 300 

CACTGTGATG GAAATATCCC ACTCCGCTTA CTTAATCAAA TATTTTATAT GCCTTTTCCT 360 

AATTATGGCT GTAATGAAGA GACTATGTGT TCATTTATGC TTATGGGACC GAATTTGTGG 420 

CCATCTGTAG ATTTTTTCAT TCACTTGAAC ATCTTATTTC CTCATTTCCT TCAGATTTCC 480 

TTCGGACCTT TCCATTCCAT TTTCAGTGAT AATGAACAAT TTCCTTATAT CTATCAGATG 540 

ACCCCAAAGG ATACATCACT AGCATTGGCA ATGGTCTCTT TCATACTTTA CTTCAACTGG 600 

AACTGGGTTG GTCTTGTCCT CTCAGATAAT GATGAAGGCA ATCAATTTCT CACAGAGTTG 660 

AAAAAAGAGA CCCACAACAC GGAAATATGC TTTGCCTTTG TGAACATGAT GGCAATCAAT 720 

GAGAATTCAT CCATGAAAAA AACTGACATG TACTACAACC AAATTGTGAT GTCAACCGCA 780

AATGTTATTA TCATTTATGG GGAACGACCC AGTATTATTG AACTGTGTTT CAGAACATGG 840 

ACATCTCCAG TCATACAGAG GATATGGGTT ACCAAATCAG AGTTGTATTT CCCAACAAGT 900 

AAGAGAGACT TAAGTCATGG AACATTCTAT GGAACTCTAG CATTTCAACA ACACCATGAT 960 

GTGATTTCTG GATTTAAAAA TTTTGTACAG ACATGGTACC ATCTCAAAAG CATGGATTTA 1020 

TATTTATTAA AGCCAGAGTG GGGTTTCTTT GAATATGAAA CCTCAGCATC TTACTGTAAA 1080 

ATACTGATGA GTAATTCATC GAATGTCTCA TTGGAATGGC TAATGGAACA GAAGTTTGAC 1140 

ATAGCCTTTA ATGACAATAG TCATAGTATA TACAATGCTG TGTACGCCAT GGCCCATGCT 1200 

CTCCATGAAA AGAATCTGAA ACAAATTGAT AATCAGGAAA TCAGCTATGG CAAAGGAGCA 1260 

AGTACTCACT GCTTGAAGTT ACACTCATTT TTGAGAACGA TCCACTTCAC CAATCCTTTT 1320 

GGGGAGAGAG TGATTATGAA AGAGAGAGTA AGAGTGCAGG AAGACTATGA CATTGTTCAC 1380 

CTGCAGAACT GCTCACAACA CCTTAGGATT AAGGTGAAGA TAGGGCAGTT CAGCCCATAT 1440

TTTCCACATG GTGGACAATT TCACTTATAT GAAGACATGA TTGATTTGGC CACAGGAAGT 1500 
AGAAAGATGC CTTTATCTAT GTGTAGTGCA GATTGTCGTC CTGGATACAG AAAATTCTGG 1560

AAGGAGGGAA TGGCAGCCTG CTGTTTTGTT TGCAGTCCCT GTCCAGACAA TGAAATTTCT 1620 

AATGAAACAA CTGTGGTACT TTGGGTCTTT GTGAAGCACC ATGACACTCC TATTGTGAAG 1680 

GCCAATAACA GAATCCTCAG CTACATATTA ATCATGTCAC TCATGTTCTG CTTTCTGTGC 1740 

TCCTTTTTCT TCATTGGCCA TCCTAACAGA GGTACCTGTA TCTTACAGCA AATCACATTT 1800 

GGAATTGTAT TCACTGTGGC TGTTTCCACA GTTCTGGCCA AAACAATCAC TGTGCTTCTG 1860 

GCTTTTCAAG TCACAGACAC AGGAAGAAAG TTAAGAAACT TCCTGGTATC GGGGACACCC 1920 

AACTACATTA TTCCCATATG TTCCCTGTTG CAATGCACTC TGTGTGCAAT TTGGCTAGCA 1980

GTTTCTCCAC CATTTGTTGA TATCGATGAA CATTCTGAGC ATGGTCACAT CATAATTGTG 2040 

TGGAACAAGG GATCTGTTAT GGCATTCTAC TGTGTCCTGG GATATTTGGC CTTCCTGGCC 2100 

CTTGGAAGTT TCACGATGGC TTTCTTGGCA AAGAATCTGC CTGACACATT CAATGAAGCC 2160 

AAGTTCTTGA CCTTCAGCAT GCTAGTGTTC TGCAGTGTCT GGATCACGTT CCTTCCTGTC 2220 

TACCATAGCA CCAAGGGCAG AGTCATGGTT GCTGTTGAAA TTTTCTCCAT TTTGACATCC 2280 

AGTGCAGGGA TGCTTGGATG CGTCTTTGCA CCCAAAATTT ACATCATTTT AATGAAACCA 2340 

GAGAGAATTC TATCCAAAAG ACAGGAGAAA TCACGTTTC 2379 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2394 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATGGTAATAT TCTTCCTTCT CAACATTCCA TTTCTCCTGG CAAATTTCAT GGATCCCAGA 60 

TGCTTTTGGA AAATAAATTT GAATGAAATC AAGGATGAAG TCCTTGGGAT GACTTGTTCC 120 

TTCATCCTTG A7VACAGTTCA GAAGACTATG GACAAAGATT ATTTCAACCA GACTCTGAAT 180 

GTCCTAAATA CAACTACAAA CCACAAATAT GCCTTGGCAT TGGCCTTTAC AGTGGATGAA 240

ATCAACAGGA ATCCTGATCT TTTACCAAAT ATGTCTCTGA TTATAAAATA CAATTTGGGT 300 

CATTGTGATG GAAAAACTGT AACAACTCTA TCCGATTTAT TTAATCCAAA TAATCATCTC 360 

CATTTCCCCA ATTATTTATG TAATGAAGGG ATTATGTGTT TGGTTCTGCT TACAGGACCA 420

CATTGGAGAG CATCTTTATA TCTCTGGATA TCCGTGTATG TCTACCTGTC TCCACATTTC 480 

CTTCAGCTTT CCTATGGACC TTTCTACTCC ATCTTCAGTG ATAATGAACA ATATCCTTAT 540 

CTCTATCAGA TGGGCCCAAA GGACTCATCA CTAGCATTGG CAATGGTCTC CTTCATAATT 600 

TACTTCAAGT GGAACTGGGT TGGGCTATTT ATCTCAGATG ATGATCAAGG CAATCAATTT 660 

CTCTCAGAGT TGAAAAAAGA GAGCCAAACC AAGGATATTT GCTTTGCCTT TGTGAACATG 720 

ATATCAGTCA GTGATGTTTC ATACTATCAT AAAACTGAAA TGTACTACAA CCAAATTGTG 780 

ATGTCATCCA CAAAGGTTAT TATCATTTAT GGGGAAACAA ACAGTATTAT TGAATTGAGC 840 

TTCAGAATGT GGTCATCTCC AGTTAAACAG AGAATATGGG TCACCACAAA ACAATTTGAT 900 

TGCCCTACCA GTAAGAGAGA CTTAACTCAT GGCACATTCT ATGGGACCCT TACATTTCTA 960 

CACCACTATG GTGAGATTTC TGGCTTTAAA AATTTTGTAC AGACACGGTA CAATCTCAGA 1020 

AGCACAGATT TATATCTAGT AATGCCAGAG TGGAAATATT TTAACTATGA AGCCTCAGCA 1080 

TCTAACTGTA AAATACTGAG AAACTATTTA TCCAATATCT CACTGGAATG GCTAATGGAA 1140 

CAGAAATTTG ACATGTCATT TAGTGATTAT AGTCACAACA TATACAATGC TGTATATGCC 1200 

ATTGCTCATG CACTCCATGA GAAGAATCTG CAAGAAGTTG AAAATCAGGC AATAAACAAT 1260 

GCGAAAGGAG AAAATACTCA CTGCTTGAAG CTAAACTCAT TTCTGAGAAA GACCCACTTC 1320 

ACTAATTCTC TTGGGAACAG AGTAATTATG AAACAGAGAG AAGTAGTGCA TGGAGACTAT 1380 

AATATTGTTC ACATGTGGAA TTTCTCACAA CGCCTTGGGA TTAAGGTGAA GATAGGACAA 1440 

TTCAGCCCAC ATTTTCCACA GGGTCAACAG TTACACTTAT ATGTAGACAT GACTGAGTTG 1500 

GCTACAGGAA GTAGAAAGAT GCCATCCTCA GTGTGCAGTG CAGATTGCCA TCCTGGATTC 1560 

AGAAGAATCT GGAAGGAGGA AATGGCAGCC TGCTGTTTTG TTTGCAACCC CTGCCCTGAA 1620 

AATGAAATTT CTAATGAGAC GATGGTGGTA TTTTGGGTCT TCGTGAAGCA CCATGACACT 1680 

CCTATTGTGA AGGCCAATAA CAGAATCCTC AGCTACCTAT TAATCGTGTC ACTCATGTTC 1740 

TGTTTTCTGT GCTCCTTTTT CTTCATTGGC TATCCTAACA GAGCAACCTG TATCTTACAG 1800 

CAAATCACAT TTGGAATCTT CTTTACTGTG GCTATTTCCA CAGTTCTGGC CAAAACAATC 1860

ACTGTGGTTC TGGCTTTCAA AGTCACAGAC CCAGGAAGAC AATTAAGAAT CTTTTTGGTA 1920 

TCGGGGACAC CCAACTACAT TATTCCCATA TGTTCCCTAT TGCAATGTAT TCTGTGTGCA 1980 

ATCTGGCTAG CAGTTTCTCC TCCCTTTGTT GATATTGATG AACACTCTGA GCATGGCCAC 2040 

ATCATCATTG TGTGCAACAA GGGCTCCATT ACTGCATTCT ACTGTGTCCT GGGATACTTG 2100 

GCCTGCCTGG CCTTTGGAAG CTTCACTATA GCTTTCTTGG CAAAGAACCT GCCTGACACA 2160 

TTCAACGAAG CCAAGTTCTT GACCTTCAGC ATGCTAGTGT TCTGCGCTGT CTGGGTCACC 2220 
TTCCTCCCTG TCTACCATAG CACCAAGGGC AAGGTCATGG TTGCTGTGGA GATCTTCTCC 2280

ATCTTGGCAT CTAGTGCAGG GATGCTGGGA TGCATCTTTG CACCCAAAGT TTACATCATT 2340 

TTAATGAGAC CAGACAGAAA TTCGATCCAC AAAATCAGGG AGAAATCATA TTTC 2394 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2085 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

GTCTACCTGT CTCCACATTT CCTTCAGCTT TCCTATGGAC CTTTCTACTC CATCTTCAGT 60 

GATAATGAAC AATATCCTTA TCTCTATCAG ATGGGCCCAA AGGACTCATC ACTAGCATTG 120 

GCAATGGTCT CCTTCATAAT TTACTTCAAG TGGAACTGGG TTGGGCTATT TATCTCAGAT 180 

GATGATCAAG GCAATCAATT TCTCTCAGAG TTGAAAAAAG AGAGCCAAAC CAAGGATATT 240

TGCTTTGCCT TTGTGAACAT GATATCAGTC AGTGATGTTT CATACTATCA TAAAACTGAA 300 

ATGTACTACA ACCAAATTGT GATGTCATCC ACAAAGGTTA TTATCATTTA TGGGGAAACA 360 

AACAGTATTA TTGAATTGAG CTTCAGAATG TGGTCATCTC CAGTTAAACA GAGAATATGG 420 

GTCACCACAA AACAATTTGA TTGCCCTACC AGTAAGAGAG ACTTAACTCA TGGCACATTC 480 

TATGGGACCC TTACATTTCT ACACCACTAT GGTGAGATTT CTGGCTTTAA AAATTTTGTA 540 

CAGACACGGT ACAATCTCAG AAGCACAGAT TTATATCTAG TAATGCCAGA GTGGAAATAT 600 

TTTAACTATG AAGCCTCAGC ATCTAACTGT AAAATACTGA GAAACTATTT ATCCAATATC 660 

TCACTGGAAT GGCTAATGGA ACAGAAATTT GACATGTCAT TTAGTGATTA TAGTCACAAC 720 

ATATACAATG CTGTATATGC CATTGCTCAT GCACTCCATG AGAAAGATCT GCAAGAATTT 780 

GAAAATCAGG CAATAAACAA TGCGAAAGGA GAAAATACTC ACTGCTTGAA GCTAAACTCA 840 

TTTCTGAGAA AGACCCACTT CACTAATTCT CTTGGGAACA GAGTAATTAT GAAACAGAGA 900 

GAAGTAGTGC ATGGAGACTA TAATATTGTT CACATGTGGA ATTTCTCACA ACGCCTTGGG 960 

ATTAAGGTGA AGATAGGACA ATTCAGCCCA CATTTTCCAC AGGGTCAACA GTTACACTTA 1020

TATGTAGACA TGACTGAGTT GGCTACAGGA AGTAGAAAGA TGCCATCCTC AGTGTGCAGT 1080 

GCAGATTGCC ATCCTGGATT CAGAAGAATC TGGAAGGAGG AAATGGCAGC CTGCTGTTTT 1140

GTTTGCAACC CCTGCCCTGA AAATGAAATT TCTAATGAGA CGAATATGGA TCAGTGTGCG 1200 

AATTGTCCAG AATACCAGTA TGCCAACAGA GAAAAGAACA AATGCATCCA GAAAGGTGTG 1260 

ATTGTTCTAA GCTATGAAGA CCCCTTGGGG ATGGCTCTTG CCTTAATAGC ATTCTGTTTC 1320 

TCTGCATTCA CAGTGGTGGT ATTTTGGGTC TTCGTGAAGC ACCATGACAC TCCTATTGTG 1380 

AAGGCCAATA ACAGAATCCT CAGCTACCTA TTAATCGTGT CACTCATGTT CTGTTTTCTG 1440 

TGCTCCTTTT TCTTCATTGG CTATCCTAAC AGAGCAACCT GTATCTTACA GCAAATCACA 1500 

TTTGGAATCT TCTTTACTGT GGCTATTTCC ACAGTTCTGG CCAAAACAAT CACTGTGGTT 1560 

CTGGCTTTCA AAGTCACAGA CCCAGGAAGA CAATTAAGAA TCTTTTTGGT ATCGGGGACA 1620 

CCCAACTACA TTATTCCCAT ATGTTCCCTA TTGCAATGTA TTCTGTGTGC AATCTGGCTA 1680 

GCAGTTTCTC CTCCCTTTGT TGATATTGAT GAACACTCTG AGCATGGCCA CATCATCATT 1740 

GTGTGCAACA AGGGCTCCAT TACTGCATTC TACTGTGTCC TGGGATACTT GGCCTGCCTG 1800

GCCTTTGGAA GCTTCACTAT AGCTTTCTTG GCAAAGAACC TGCCTGACAC ATTCAACGAA 1860 

GCCAAGTTCT TGACCTTCAG CATGCTAGTG TTCTGCGCTG TCTGGGTCAC CTTCCTCCCT 1920 

GTCTACCATA GCACCAAGGG CAAGGTCATG GTTGCTGTGG AGATCTTCTC CATCTTGGCA 1980 

TCTAGTGCAG GGATGCTGGG ATGCATCTTT GCACCCAAAG TTTACATCAT TTTAATGAGA 2040 

CCAGACAGAA ATTCGATCCA CAAAATCAGG GAGAAATCAT ATTTC 2085 



We claim: 
Claims

1. A family of pheromone receptor polypeptides, each of said polypeptides comprising from
amino terminus to carboxyl terminus: 

(a) an amino-terminal extracellular domain containing from 30 to 600 amino acids;

(b) a transmembrane region comprising: 

(i) seven non-contiguous transmembrane domains designated TM1, TM2, TM3, 
TM4, TM5, TM6 and TM7 

(ii) three non-contiguous extracellular domains designated EC2, EC3 and EC4, and 
(iii) three non-contiguous intracellular domains designated IC1, IC2, and IC3,

wherein the transmembrane domains, the extracellular domains and the intracellular 
domains are attached to one another from amino terminus to carboxyl terminus in the order
TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and

wherein the transmembrane region has at least about 35% homology and a length
approximately equal to a transmembrane region of a polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and 

(c) a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids; 
wherein the pheromone receptor polypeptides are expressed in a Gαo protein-expressing
vomeronasal organ neuron or are expressed in another olfactory organ neuron in an animal which
does not possess a vomeronasal organ.

2. The polypeptides of claim 1, wherein the transmembrane region of each of said 
polypeptides has at least between about 60% and about 90% homology to the transdomain region 
of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4,
6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50.

3. The polypeptides of claims 1 or 2, wherein the non-contiguous intracellular domains of
each of said polypeptides has at least between about 60% and about 90% homology to the
non-contiguous intracellular domains of a pheromone receptor polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 34, 36, 38, 40, 42, 44, 46, 48, and 50.

4. The polypeptides of claim 1, wherein the extracellular domain of each of said
polypeptides has at least between about 50% and about 90% homology to the extracellular 
domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50. 

5. The polypeptides of claim 2, wherein the extracellular domain of each of said 
polypeptides has at least between about 50% and about 90% homology to the extracellular 
domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50. 

6. The polypeptides of claim 3, wherein the extracellular domain of each of said 
polypeptides has at least between about 50% and about 90% homology to the extracellular 
domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 
2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50.

7. The polypeptides of claims 1 or 2, wherein the extracellular domain contains at least 
between about 50 and about 500 amino acids. 

8. The polypeptides of claim 3, wherein the extracellular domain contains at least between 
about 50 and about 500 amino acids.

9. The polypeptides of claims 4, 5 or 6, further comprising a signal sequence attached to the 
amino terminus of the extracellular domain. 

10. The polypeptides of claim 9, wherein the signal sequence is selected from the group of
signal sequences of a pheromone receptor polypeptide of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. 

11. A method for identifying a nucleic acid encoding a pheromone receptor polypeptide, 
comprising:

(1) contacting a mixture of nucleic acid molecules with at least one nucleic acid probe 
of a nucleic acid selected from the group consisting of: (a) a nucleic acid molecule selected from 
the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone receptor polypeptide;
(b) a unique fragment of (a); (c) a human homolog of (a) or (b); and (d) a set of degenerate
primers of any of (a), (b) or (c); and

(2) identifying the sequences within the mixture that hybridize to the probe.

12. The method of claim 11, wherein the mixture is a genomic library.

13. The method of claim 11, wherein the mixture is a cDNA library.

14. The method of claim 11, wherein the nucleic acid probe contains a detectable label.

15. The method of claim 11, wherein the at least one nucleic acid probe is a pair of 
degenerate polymerase chain reaction primers that amplify a unique fragment of a nucleic acid
molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55, the method further 
comprising the step of subjecting the mixture to a polymerase chain reaction amplification 
reaction prior to selecting a member of the mixture which hybridizes to the nucleic acid probe. 

16. The method of claim 15, wherein the pair of degenerate polymerase chain reaction
primers is selected from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID NOs. 62 and 
63, SEQ ID NOs. 64 and 63, SEQ ID NOs. 64 and 65, and SEQ ID NOs. 66 and 67. 

17. The method of claim 16, wherein the pair of polymerase chain reaction primers is selected
from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID NOs. 62 and 63, and SEQ ID
NOs. 64 and 63.

18. An isolated nucleic acid molecule

(a) which hybridizes under high or low stringency conditions to a molecule consisting
of a nucleic acid sequence selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55, and
which codes for a pheromone receptor,

(b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon
sequence due to the degeneracy of the genetic code, and 

(c) complements of (a) and (b). 

19. The nucleic acid molecule of claim 18, wherein the pheromone receptor is expressed in
the vomeronasal organ or is expressed in another olfactory organ in an animal which does not 
possess a vomeronasal organ. 

20. The nucleic acid molecule of claim 18, wherein the pheromone receptor is expressed in 
a Gαo protein-expressing vomeronasal organ neuron.

21. The nucleic acid molecule of claim 18, wherein the pheromone receptor is a G-protein
coupled receptor. 

22. The isolated nucleic acid molecule of claim 18, wherein the pheromone receptor has an
amino acid sequence selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. 

23. The isolated nucleic acid molecule of claim 18, wherein the isolated nucleic acid 
molecule is selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72,
73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a
pheromone receptor polypeptide. 

24. The isolated nucleic acid molecule of claim 18, wherein the isolated molecule comprises 
a molecule having a sequence which encodes a pheromone receptor unique fragment, wherein
said unique fragment is selected from the group consisting of a pheromone receptor extracellular
domain, a pheromone receptor transmembrane domain, a pheromone receptor intracellular
domain, a pheromone receptor extracellular domain coupled to at least one transmembrane
domain, and at least one pheromone receptor transmembrane domain coupled to a pheromone
receptor intracellular domain.

25. The isolated nucleic acid molecule of claim 18, wherein the pheromone receptor
extracellular domain, the pheromone receptor transmembrane domain and the pheromone
receptor intracellular domain have amino acid sequences selected from the group of sequences
identified as these domains in SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.



26. The isolated nucleic acid molecule of claim 18, wherein the unique fragment is selected 
from the group consisting of between 12 and 4000, between 12 and 2000, between 12 and 1000, 
between 12 and 500, between 12 and 250, between 12 and 100, between 12 and 50, and between 
12 and 25, nucleotides in length.



27. An isolated nucleic acid molecule, comprising:

(a) a molecule having a sequence selected from the group consisting of SEQ ID NO. 51,
53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, and 92, and which codes for a pheromone receptor;

(b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon
sequence due to the degeneracy of the genetic code, and

(c) complements of (a) and (b).

28. An expression vector comprising the isolated nucleic acid molecule of claims 18-27
operably linked to a promoter. 

29. A host cell transformed or transfected with the isolated nucleic acid molecule of claims 
18-27. 

30. A host cell transformed or transfected with the isolated nucleic acid molecule of the 
expression vector of claim 28. 

31. An isolated polypeptide encoded by the isolated nucleic acid molecule of claims 18-27.

32. The isolated polypeptide of claim 31, wherein the isolated polypeptide has a pheromone 
receptor activity. 

33. The isolated polypeptide of claim 31, wherein the isolated polypeptide comprises
a polypeptide selected from group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. 

34. The isolated polypeptide of claim 33, wherein the isolated polypeptide is a fragment of 
a peptide selected from the group consisting of an extracellular domain, a transmembrane 
domain and an intracellular domain, wherein the foregoing domains have amino acid 
sequences selected from the group of sequences identified as these domains of a 
pheromone receptor polypeptide selected from group consisting of SEQ ID NO. 2, 4, 6, 
8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. 

35. A vaccine containing an isolated polypeptide selected from the group consisting of the 
isolated polypeptides of claim 31, 32, 33, and 34. 

36. A method for controlling fertility in an animal, comprising: 

administering to an animal in need of such treatment, an effective amount of the 
vaccine of claim 35 to elicit an immune response to the isolated polypeptide. 

37. An isolated binding polypeptide which binds selectively to a polypeptide of claim 1, 2, 
4, 5, 6, 8, 10, 31, 32, 33, and 34, provided that the isolated binding polypeptide does not 
bind to a G-protein coupled receptor other than a Gαo+-coupled pheromone receptor.

38. The isolated binding polypeptide of claim 37, wherein the binding polypeptide binds to 
a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 
16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.

39. The isolated binding polypeptide of claim 37, wherein the binding polypeptide is an 
antibody fragment selected from the group consisting of a Fab fragment, a F(ab)2
fragment or a fragment including a CDR3 region selective for a pheromone receptor 
polypeptide. 

40. The isolated binding polypeptide of claim 38, wherein the binding polypeptide is an
antibody fragment selected from the group consisting of a Fab fragment, a F(ab)2
fragment or a fragment including a CDR3 region selective for a pheromone receptor 
polypeptide. 

41. An affinity matrix comprising: 

a solid support to which is coupled an isolated binding polypeptide selected 
from the group consisting of the binding polypeptides of any of claims 37-40. 



42. A method for isolating a pheromone receptor, comprising:

contacting a composition containing a putative pheromone receptor with the affinity 
matrix of claim 41 under conditions to permit the pheromone receptor to selectively bind to the 
binding polypeptides coupled to the solid support; and 

isolating the polypeptides that bind to the affinity matrix. 

43. A composition comprising: 

the polypeptide of claim 1, 2, 4, 5, 6, 8, 10, 31, 32, 33, or 34; and
a pharmaceutically acceptable carrier. 



44. A composition comprising:

the nucleic acid molecule of any of claims 18-28; and
a pharmaceutically acceptable carrier. 



45. A composition comprising: 
the binding polypeptide of claim 37; and

a pharmaceutically acceptable carrier. 



46. A composition comprising: 

the binding polypeptide of claims 38, 39 or 40; and 
a pharmaceutically acceptable carrier.



47. A method for modulating a pheromone receptor activity in a cell, comprising:

administering to the cell an amount of the isolated binding polypeptide of claim 
37 effective to modulate pheromone receptor activity in the cell. 

48. A method for modulating a pheromone receptor activity in a cell, comprising: 

administering to the cell an amount of the isolated binding polypeptide of claim 
38, 39, or 40 effective to modulate pheromone receptor activity in the cell. 

49. The method of claim 47, wherein modulating a pheromone receptor activity comprises 
reducing the pheromone receptor activity. 

50. The method of claim 48, wherein modulating a pheromone receptor activity comprises 
reducing the pheromone receptor activity. 

51. The method of claim 47, wherein the pheromone receptor activity is selected from the
group consisting of a signal transduction activity and a ligand binding activity. 

52. The method of claim 48, wherein the pheromone receptor activity is selected from the 
group consisting of a signal transduction activity and a ligand binding activity. 

53. The method of claim 47, wherein the cell is a vertebrate cell, preferably a mammalian 
cell. 

54. The method of claim 48, wherein the cell is a vertebrate cell, preferably a mammalian 
cell. 

55. The method of claim 47, wherein the cell is an invertebrate cell, preferably an insect cell. 

56. The method of claim 48, wherein the cell is an invertebrate cell, preferably an insect cell. 

57. A method for reducing the binding of a pheromone having a binding domain to a 
pheromone receptor having a ligand binding site that selectively binds to the binding 
domain of the pheromone, comprising: 

contacting the pheromone receptor with an agent which binds to the binding
domain for a time effective to reduce binding of the pheromone to the ligand binding site of the 
pheromone receptor. 

58. The method of claim 57, wherein the agent is an antibody which binds to the binding
domain. 

59. A method for decreasing pheromone receptor mediated signal transduction activity in a 
subject comprising: 

administering to a subject in need of such treatment an agent that selectively binds to
an isolated nucleic acid molecule of claim 1 or an expression product thereof, in an
amount effective to decrease pheromone receptor mediated signal transduction activity in the 
subject. 

60. The method of claim 59, wherein the agent is selected from the group consisting of an
antisense nucleic acid and a binding polypeptide. 

61. A method for identifying lead compounds for a pharmacological agent useful in the 
diagnosis or treatment of disease associated with pheromone binding to a pheromone receptor
polypeptide containing a ligand binding site that selectively binds to a binding domain of the
pheromone, comprising 

forming a mixture comprising a pheromone receptor polypeptide or unique fragment 
thereof containing a ligand binding site, a molecule protein containing a binding domain which 
selectively binds the pheromone receptor ligand binding site, and a candidate pharmacological
agent,

incubating the mixture under conditions which, in the absence of the candidate 
pharmacological agent, permit a first amount of selective binding of the molecule containing a 
ligand binding domain by the pheromone receptor ligand binding site, and 

detecting a test amount of selective binding of the molecule containing the binding 
domain by the pheromone receptor ligand binding site, wherein reduction of the test amount of
selective binding relative to the first amount of selective binding indicates that the candidate 
pharmacological agent is a lead compound for a pharmacological agent which disrupts selective 
binding of a molecule containing a binding domain by a pheromone receptor containing a ligand
binding site and wherein increase of the test amount of selective binding relative to the first
amount of selective binding indicates that the candidate pharmacological agent is a lead
compound for a pharmacological agent which enhances selective binding of a molecule
containing a binding domain by a pheromone receptor polypeptide containing a ligand binding
site. 

AMENDED CLAIMS
[received by the International Bureau on 11 December 1998 (11.12.98);
original claim 1 amended; remaining claims unchanged (1 page)]

1. A family of isolated pheromone receptor polypeptides, each of said isolated
polypeptides comprising from amino terminus to carboxyl terminus:

(a) an amino-terminal extracellular domain containing from 30 to 600 amino acids;

(b) a transmembrane region comprising:

(i) seven non-contiguous transmembrane domains designated TM1, TM2, TM3,
TM4, TM5, TM6 and TM7

(ii) three non-contiguous extracellular domains designated EC2, EC3 and EC4, and

(iii) three non-contiguous intracellular domains designated IC1, IC2, and IC3,

wherein the transmembrane domains, the extracellular domains and the intracellular
domains are attached to one another from amino terminus to carboxyl terminus in the order
TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and

wherein the transmembrane region has at least about 35% homology and a length
approximately equal to a transmembrane region of a polypeptide selected from the group
consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and

(c) a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids;
wherein the pheromone receptor polypeptides are expressed in a Gαo protein-expressing
vomeronasal organ neuron or are expressed in another olfactory organ neuron in an
animal which does not possess a vomeronasal organ.

2. The polypeptides of claim 1, wherein the transmembrane region of each of said
polypeptides has at least between about 60% and about 90% homology to the transdomain
region of a pheromone receptor polypeptide selected from the group consisting of SEQ ID
NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50.

3. The polypeptides of claims 1 or 2, wherein the non-contiguous intracellular domains
of each of said polypeptides has at least between about 60% and about 90% homology to the
non-contiguous intracellular domains of a pheromone receptor polypeptide selected from the
group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 34, 36, 38, 40, 42, 44, 46, 48, and 50.